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Abstract 

Recently, it has been recognized that phase transitions play an important role in the 
probabilistic analysis of combinatorial optimization problems. However, there are 
in fact many other relations that lead to close ties between computer science and 
statistical physics. This review aims at presenting the tools and concepts designed 
by physicists to deal with optimization or decision problems in an accessible lan- 
guage for computer scientists and mathematicians, with no prerequisites in physics. 
We first introduce some elementary methods of statistical mechanics and then pro- 
gressively cover the tools appropriate for disordered systems. In each case, we apply 
these methods to study the phase transitions or the statistical properties of the 
optimal solutions in various combinatorial problems. We cover in detail the Ran- 
dom Graph, the Satisfiability, and the Traveling Salesman problems. References to 
the physics literature on optimization are provided. We also give our perspective 
regarding the interdisciplinary contribution of physics to computer science. 
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1 Introduction 



At the heart of statistical physics, discrete mathematics, and theoretical com- 
puter science, lie mathematically similar counting and optimization problems. 
This situation leads to a transgression of boundaries so that progress in one 
discipline can benefit the others. An old example of this is the work of Kaste- 
leyn (a physicist) who introduced a method for counting perfect matchings 
over planar graphs (a discrete mathematics problem). Our belief is that a 
similar cross-fertilization of methods and models should arise in the study 
of combinatorial problems over random structures. Such problems have at- 
tracted the attention of a large community of researcher in the last decade, 
but a transgression of boundaries has only just begun. One of the many po- 
tential spin-offs of this kind of cross-fertilization would be the use of computer 
science and graph theoretical methods to tackle unsolved problems in the sta- 
tistical physics of "complex" (disordered) systems. But we also hope that the 
benefits can go the other way, z.e., that the recent developments in statistical 
physics may be of use to the other two communities; such is our motivation 
for this article. 

This review does not assume any knowledge in physics, and thus we expect 
it to be accessible to mathematicians and computer scientists eager to learn 
the main ideas and tools of statistical physics when applied to random com- 
binatorics. We have chosen to illustrate these "physical" approaches on three 
problems: the Random Graph, the Satisfiabihty, and the Traveling Salesman 
problems. This particular focus should help the interested reader explore the 
statistical physics literature on decision and optimization problems. Further- 
more, we hope to make the case that these methods, developed during the last 
twenty years in the context of the so called spin glass theory [1,2], may provide 
new concepts and results in the study of phase transitions., and average case 
computational complexity, in computer science problems. Some examples of 
this kind of methodological transfer can also be found in three other papers 
of this TCS special issue, dealing with statistical mechanics analyses of ver- 
tex covering on random graphs [3] , of number partitioning [4] and of learning 
theory in artificial neural networks [5]. 

Random combinatorics became a central part of graph theory following the 
pioneering work by Erdos and Renyi. Their study of clusters in random graphs 
(percolation for physicists) showed the existence of zero-one laws (phase tran- 
sitions in the terminology of physics). More recently, such phenomena have 
played a fundamental role when tackling average-case complexity. Indeed, 
numerical evidence suggests that the onset of intractability in random NP- 
complete problems can be put in relation with the appearance of phase tran- 
sitions analogous to the percolation transition. Interestingly, the concept of 
random structures is present in most natural sciences, including biology, chem- 
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istry, or physics. But in the last two decades, the theoretical framework de- 
veloped in physics has lead to new analytical and numerical tools that can 
be shared with the more mathematical disciplines. The potential connections 
between discrete mathematics, theoretical computer science and statistical 
physics become particularly obvious when one considers the typical properties 
of random systems. In such cases, percolation, zero-one laws, or phase tran- 
sitions are simply different names describing the same phenomena within the 
different disciplines. It seems to us that much can be gained by exploring the 
complementary nature of the different paradigms in mathematics and physics. 
In what follows, we shall try to make this happen by giving a thorough sta- 
tistical mechanics analysis of three prototype problems, namely percolation 
in random graphs, satisfiability in random K-Satisfiability, and optimization 
via the Traveling Salesman Problem. The review is preceded by a general dis- 
cussion of some basic concepts and tools of statistical mechanics. We have 
also included simple exercises to help the interested reader become familiar 
with the methodology; hopefully he (she) will be able to adapt it to the study 
of many other problems, e.g., matching, number partitioning [4], etc... When 
appropriate, we compare the results of statistical physics to those of discrete 
mathematics and computer science. 

Prom a statistical mechanics perspective, a phase transition is nothing but the 
onset of non-trivial macroscopic (collective) behavior in a system composed 
of a large number of "elements" that follow simple microscopic laws. The 
analogy with random graphs is straightforward. There the elements are the 
edges of the graph which are added at random at each time step and the 
macroscopic phenomenon is the appearance of a connected component of the 
graph containing a finite fraction of all the vertices, in the limit of a very 
large number of vertices. If a system has a phase transition, it can be in 
one of several "phases" , depending on the values of some control parameters. 
Each phase is characterized by a different microscopic organization. Central 
to this characterization is the identification of an order parameter (usually the 
expectation value of a microscopic quantity) which discriminates between the 
different phases. Once again the analogy with random graphs is appropriate. 
An order parameter of the percolation transition is the fraction of vertices 
belonging to the giant connected component. Such a fraction is zero below the 
percolation transition, that is, when the connectivity of the random graph is 
too small, and becomes strictly positive beyond the percolation threshold. 

While in percolation it is proven that the order parameter is indeed the frac- 
tion of vertices belonging to the infinite giant component, in more complicated 
systems the determination of an order parameter is generally an open prob- 
lem. Though not rigourous, statistical mechanics provides numerous specific 
methods for identifying and studying order parameters, and we shall illustrate 
this on the K-Satisfiability problem. This step is useful of course for providing 
a good intuitive view of the system's behavior, but more importantly it also 
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gives information on the microscopic structure of the phases, information that 
can be used both in deriving analytical results and in interpreting numerical 
simulations. 

The way physicists and mathematicians proceed is quite different. Theoretical 

physicists generally do not prove theorems, rather they attempt to understand 
problems by obtaining exact and approximate results based on reasonable hy- 
potheses. In practice, these hypotheses are "validated" a posteriori through 
comparison with experiments or numerical simulations, and through consis- 
tency with the overall body of knowledge in physics. In this sense, theoretical 
physics must be distinguished from mathematical physics whose scope is to 
make rigorous statements. Of course, exact solutions play an important role 
in statistical physics in that they represent limiting cases where analytical or 
numerical techniques can be checked, but they are not the main focus of this 
discipline. 

For the sake of brevity we left out from this review some very relevant and 
closely connected topics such as exact enumeration methods [6] or applications 
of computer science algorithms to the study of two dimensional complex phys- 
ical systems [7,8]. Furthermore we do not claim to present a complete picture 
of what has been done by physicists on decision and optimization problems. 
Rather, we hope that what we do present will enable readers from the more 
mathematical disciplines to understand in detail the majority of what has been 
done by physicists using the methods of statistical mechanics. 



2 Elements of Statistical Physics 

In this section, the reader will be introduced to the basic notions of statis- 
tical mechanics. We start by illustrating on various examples the existence 
of phases and phase transitions, ubiquitous in physics and more surprisingly 
in other fields of science too. The concepts of microscopic and macroscopic 
levels of description naturally appear and allow for a rapid presentation of 
the foundations of statistical mechanics. We then expose in greater detail 
the combinatorial interpretation of statistical mechanics and introduce some 
key vocabulary and definitions. An accurate investigation of the properties 
of the so-called Ising model on the complete graph T^jy exemplifies the above 
concepts and calculation techniques. In order to bridge the gap with optimiza- 
tion problems, we then turn to the crucial issue of randomness and present 
appropriate analytical techniques to deal with random structures, e.g., the 
celebrated replica method. 

This section has been elaborated for a non physicist readers and we stress 
that no a priori knowledge of statistical mechanics is required. Exercises have 
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been included to illustrate key notions and should help the reader to acquire 
a deeper understanding of concepts and techniques. Solutions are sketched in 
Appendix A. Excellent presentations of statistical mechanics can be found in 
textbooks e.g. [9-11] for readers wanting further details. 



2.1 Phases and transitions 

Many physical compounds can exist in nature as distinct "states", called 
phases, depending on the values of control parameters, such as temperature, 

pressure, ... The change of phase happens very abruptly at some precise val- 
ues of the parameters and is called transition. We list below a few well-known 
examples from condensed matter physics as well as two cases coming from 
biology and computer science. 

2.1.1 Liquid-gas transition. 

At atmospheric pressure water boils at a "critical" temperature Tc — 100°C. 
When the temperature T is lower than water is a liquid while above Tc it is 
a gas. At the critical temperature Tc, a coexistence between the liquid and gas 
phases is possible: the fraction of liquid water depends only on the total volume 
occupied by both phases. The coexistence of the two phases at criticality is an 
essential feature of the liquid-gas transition. Transitions sharing this property 
are called first order phase transitions for mathematical reasons exposed later. 

2.1.2 Ferromagnetic-paramagnetic transition. 

It is well-known that magnets attract nails made out of iron. The magnetic 
field produced by the magnet induces some strong internal magnetization in 
the nail resulting in an attractive force. Materials behaving as iron are re- 
ferred to as ferromagnetic. However, the attractive force disappears when the 
temperature of the nail is raised above Tc — 770°C. The nail then enters the 
paramagnetic phase where the net magnetization vanishes. There is no phase 
coexistence at the critical temperature; the transition is said to be of second 
order. 

The ferromagnetic-paramagnetic transition temperature Tc varies consider- 
ably with the material under consideration. For instance, = 1115°C for 

cobalt, Tc = 454°C for nickel and Tc = 585°C for magnetite (Fe304). However, 
remarkably, it turns out that some other quantities - the critical exponents 
related to the (drastic) changes of physical properties at or close to the transi- 
tion - are equal for a large class of materials! The discovery of such universality 
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was a breakthrough and led to very deep theoretical developments in modern 
physics. Universality is characteristic of second order phase transitions. 

2.1.3 Conductor-superconductor transition. 

Good conductors such as copper are used to make electric wires because of 
their weak resistance to electric currents at room temperature. As the temper- 
ature is lowered, electrical resistance generally decreases smoothly as coUisions 
between electrons and vibrations of the metallic crystal become weaker and 
weaker. In 1911, Kammerling Onnes observed that the electrical resistance of 
a sample of mercury fell abruptly down to zero as temperature passed through 
Tc ~ 4:.2°K {0°K being the absolute zero of the Kelvin scale.) This change 
of state, between a normal conductor (finite resistance) and a superconductor 
(zero resistance) is a true phase transition: a very small variation of tempera- 
ture at Tc is enough to change resistance by four or five orders of magnitude! 

2.1.4 DNA denaturation transition. 

In physiological conditions, DNA has the double helix structure discovered by 
Watson and Crick in 1953. The two strands carry complementary sequences 
of A, T, G or C bases and are intertwined, forming either A-T or G-C pairs. 
Bases in a pair are attached together by hydrogen bonds. As the temperature is 
raised or ionic conditions are appropriately modified, bonds weaken and break 
up. The strands may then separate so that the double helix structure is lost: 
the DNA is denatured. This transition is abrupt on repeated homogeneous 
DNA sequences [12] . 

Recent micromanipulation experiments on individual DNA molecules have 
shown that denaturation can also be obtained through a mechanical action on 
DNA. When imposing a sufficient torque to the molecule to unwind the double 
helix, the latter opens up and DNA denaturates. At a fixed critical torque, 
denaturated and double helix regions may coexist along the same molecule[13] 
so this transition is like a liquid-gas one. 

2.1.5 Transition in the random K- Satisfiability problem. 

Computer scientists discovered some years ago that the random K-Satisfiability 
problem exhibits a threshold phenomenon as the ratio a of the number of 
clauses (M) over the number of Boolean variables (N) crosses a critical value 
ac{K) depending on the number of literals per clause K. When a is smaller 
than the threshold ac{K), a randomly drawn formula is almost surely satis- 
fiable while, above threshold, it is unsatisfiable with probability reaching one 
in the N ^ 00 limit. 
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For K — 2, the threshold is known exactly: ac(2) = 1. For > 3, there is no 
rigorous proof of the existence of a phase transition so far but many theoretical 
and numerical results strongly support it, see articles by Achlioptas & Franco 
and Dubois & Kirousis in the present issue. Current best estimates indicate 
that the threshold of random 3-SAT is located at q;c(3) ~ 4.25. Statistical 
physics studies show that the order of the phase transition depends on K, the 
transition being continuous for 2-SAT and of first order for 3-SAT (and higher 
values of K) . 



2.1.6 Macroscopic vs. microscopic descriptions. 

What can be inferred from the above examples? First, a (physical) system may 
be found in totally different phases with very different macroscopic properties 
although its intrinsic composition at a microscopic level (molecules, magnetic 
spins, base pairs, clauses, ...) is the same. However, from a physical, mechani- 
cal, electrical, biological, computational, ... point of view, essential properties 
of this system change completely from a phase to another. Second, the abrupt 
change of phase follows from very slight modifications of a control parame- 
ter e.g. temperature, torque, ratio of clauses per variable ... about a critical 
value. Thirdly, critical exponents, that characterize quantitatively second or- 
der phase transitions, are universal, that is, insensitive to many details of the 
systems under study. Last of all, transitions appear for large systems only. 

The above points raise some fundamental questions: how can the main features 
of a system at a macroscopic level, defining a phase, change abruptly and 
how are these features related to the microscopic structure of the system? 
Statistical physics focuses on these questions. 



2.2 Foundations of statistical mechanics and relationship with combinatorics. 
2.2.1 Needs for a statistical description. 

Statistical physics aims at predicting quantitatively the macroscopic behaviour 
of a system (and in particular its phases) from the knowledge of its microscopic 
components and their interactions. What do we mean by interaction? Consider 
for instance a liquid made of N small particles (idealized representations of 
atoms or molecules) occupying positions of coordinates fj in Euclidean space 

— * 

where label i runs from 1 to A^. Particle number i is subject to a force fi 
(interaction) due to the presence of neighboring particles; this force generally 
depends of the relative positions of these particles. To determine the positions 
of the particles at any later time t, we must integrate the equations of motion 
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given by Newton's fundamental law of mechanics, 

m^^Mirj}), ii^i,...,N), (1) 

where rrii is the mass of particle i. Solving these equations cannot be done in 
practice. The forces fi are indeed highly non linear functions of the particle 
positions fj. We therefore wind up with a set of complicated coupled differen- 
tial equations whose number N, of order ~ 10^^, is gigantic and not amenable 
to analytical treatment. 

This impossibility, added to the intuitive feeling that understanding macro- 
scopic properties cannot require the exact knowledge of all microscopic trajec- 
tories of particles has been circumvented by a totally different approach. The 
basic idea is to describe the system of particles in a probabilistic way in order 
to deduce macroscopic features as emergent statistical properties. 



2.2.2 Probability distribution over the set of configurations. 

The implementation of this idea has required the introduction of revolutionary 
concepts at the end of the ninteenth century by Boltzmann and followers, 
and in particular, the ideas of ergodicity and thermodynamical equilibrium. 
We shall not attempt here to provide an exposition of these concepts. The 
interested reader can consult textbooks e.g. [9-11]. As far as combinatorial 
aspects of statistical mechanics are concerned, it is sufficient to start from the 
following postulate. 

A configuration C of the system, that is, the specification of the particle 
positions {rj}, has a probability p{C) to be realized at any time when the 
system is in equilibrium. In other words, the system will be in configuration 
C with probability p{C). The latter depends on temperature T and equals 

p(C) = i exp (-1 . (2) 

In the above expression, E is the energy and is a real- valued function, over the 
set of configurations. The partition function Z ensures the correct normaliza- 
tion of the probability distribution p, 

Z = ^exp(-i£;(C)) . (3) 

Note that we have used a discrete sum over configurations C in (3) instead of 
an integral over particle positions fj. This notation has been chosen since all 
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the partition functions we shall meet in the course of studying optimization 
problems are related to finite {i.e. discrete) sets of configurations. 

Consider two limiting cases of (2): 

• infinite temperature T = oo: the probability p{C) becomes independent 
of C. All configurations are thus equiprobable. The system is in a fully 
"disordered" phase, like a gas or a paramagnet. 

• zero temperature T — 0: the probability p{C) is concentrated on the min- 
imum of the energy function E, called the ground state. This minimum 
corresponds to a configuration where all particles are at mechanically sta- 
ble positions, that is, occupy positions carefully optimized so that all 
forces fi vanish. Often, these strong constraints define regular packings of 
particles and the system achieves a perfect crystaUine and "ordered" state. 

When varying the temperature, intermediate situations can be reached. We 
now examine some simple examples. 

2.2.3 Cases of one and two spins. 

We now consider the case of a single abstract particle that can sit at two 
different positions only. This simple system can be recast as follows. Let us 
imagine an arrow capable of pointing in the up or down directions only. This 
arrow is usually called a spin and the direction is denoted by a binary variable 
a, equal to if the spin is up, to —1 if the spin is down. 

In this single particle system, there are only two possible configurations C = 
{+1} and C = { — 1} and we choose for the energy function E{(t) = —a. Note 
that additive constants in E have no effect on (2) and multiplicative constants 
can be absorbed in the temperature T. The partition function can be easily 
computed from (3) and reads Z = 2 cosh (5 where (5 — 1/T denotes the inverse 
temperature. The probabilities that the spin points up or down are respectively 
= exp(/3)/Z and p_ = exp{—f3)/Z. At infinite temperature {(3 = 0), the 
spin is indifferently up or down: p{+l) = p(— 1) = 1/2. Conversely, at zero 
temperature, it only points upwards: p(+l) = l,p{—l) = 0. C = {+1} is the 
configuration of minimum energy. 

The average value of the spin, called magnetization is given by 

m = ((7)r = X] P('^) ^ tanh(/3) . (4) 

(T=±l 

The symbol {■)t denotes the average over the probability distribution p. Notice 
that, when the temperature is lowered from T = oo down to T = 0, the 
magnetization increases smoothly from m = up to m = 1. There is no 
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abrupt change (singularity or non analyticity) in m as a function of (5 and 
therefore no phase transition. 



Exercise 1: Consider two spins ai and (72 with energy function 

E{ai,a2) = -aia2 . (5) 

Calculate the partition function, the magnetization of each spin as well as the 
average value of the energy. Repeat these calculations for 

-E((7i,(72) = -(7i - (72 . (6) 

How is the latter choice related to the single spin case? 
2.2.4 Combinatorial meaning of the partition function. 

We have so far introduced statistical mechanics in probabihstic terms. There 
exists also a close relationship with combinatorics through the enumeration of 
configurations at a given energy; we now show this relationship. 

The average value of the energy may be computed directly from the definition 

{E)T^Y.PiC)E{C) , (7) 
c 

or from the partition function Z via the following identity 

Wr = -^lnZ . (8) 

that can easily derived from (3). The identity (8) can be extended to higher 
moments of the energy. For instance, the variance of E can be computed from 
the second derivative of the partition function 

{E')T-{E)l=^^\nZ . (9) 

Such equalities suggest that Z is the generating function of the configuration 
energies. To prove this statement, let us rewrite (3) as 



Z = ^exp(-/? E{C)) 
c 
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= ^7V(£;) exp (-/?£;) 

E 



(10) 



where N{E) is the number of configurations C having energies E[C) precisely 
equal to ii^. If a; = exp(— /3), Z{x) is simply the generating function of the 
coefficients N{E) as usually defined in combinatorics. 

The quantity S{E) = In N{E) is called the entropy associated with the en- 
ergy E. In general, calculating S{E) is a very hard task. Usually, it is much 
more convenient to define the average entropy {S)t at temperature T as the 
contribution to the partition function which is not directly due to energy, 

{S)T-~[F{T)-{E)r) , (11) 

where 

F(r) = -rinz(r) (12) 

is called the free-energy of the system. 

In general, the above definitions for the energy and temperature dependent 
entropies do not coincide. However, as explained in next Section, in the large 
size limit {S)t equals S{E) provided that the energy E is set to its thermal 
average E — {E)t- 

The entropy is an increasing function of temperature. At zero temperature, it 
corresponds to the logarithm of the number of absolute minima of the energy 
function E{C). 

Exercise 2: Prove this last statement. 

2.2.5 Large size limit and onset of singularity. 

We have not encountered any phase transition in the above examples of sys- 
tems with one or two spins. A necessary condition for the existence of a tran- 
sition in a system is indeed that the size of the latter goes to infinity. The 
mathematical reason is simple: if the number of terms in the sum (3) is finite, 
the partition function Z, the free-energy F, the average energy, ... are ana- 
lytic functions of the inverse temperature (5 and so do not have singularities 
at finite temperature. 

Most analytical studies are therefore devoted to the understanding of the 
emergence of singularities in the free-energy when the size of the system goes 
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to infinity, the so-called thermodynamic limit. 



An important feature of the thermodynamic limit is the concentration of mea- 
sure for observables e.g. energy or entropy. Such quantities do not fluctuate 
much around their mean values. More precisely, if we call N the size, i.e. the 
number of spins, of the system, the moments of the energy usually scale as 



{E)t^O{N) 

{E')t-{E)1^0{N) , (13) 

and. thus the energy of a configuration is with high probability equal to the 
average value up to 0{^/N) fluctuations. Such a result also applies to the 
entropy, and {S)t = S{{E)t) up to 0{\/N) terms. Measure concentration in 
the thermodynamic limit is a very important and useful property, see [14]. 



2.3 Spin model on the complete graph. 



We shall now study a system of N spins, called the Ising model, exhibiting a 
phase transition in the limit N ^ oo. We consider the complete graph K^^; 
each vertex is labelled by an integer number i = 1. . . . , N and carries a binary 
spin (Tj. The energy function of a configuration C = {cJi, cr at} is given by 

E{ai,. . . ,aN) ^ -^Ylai<^j - hY^CTi ■ (14) 

i<j i 



2.3.1 Remarks on the energy function. 

The first term in (14) is called the interaction term. The sum runs over all 
pairs of spins, that is over all edges of Kp^. The minus sign ensures that the 
minimum of energy is reached when all spins point in the same direction. This 
direction depends on the second term of (14) and, more precisely, upon the 
sign of the "magnetic field" h. If the latter is positive (respectively negative), 
the ground state is obtained when all spins are up (resp. down). 

In the absence of field (/i = 0), we know the two ground states. The energy 
and entropy at zero temperature can be computed from (14) and (11), 



{E)t=o^-\{N-1) , (15) 
(>5)T=o = ln2 . (16) 
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Notice that the ground state energy is 0{N) due to the presence of the factor 
in (14) whereas the entropy is 0(1). 



At infinite temperature, all configurations are equiprobable. The partition 
function is simply equal to the total number of configurations: Zt=oo — 2^, 
leading to 



(£;)t=oo = o , (17) 

(-5)r=cx. = A^ ln2 . (18) 



When the temperature is finite, a compromise is realized in (10) between 
energy and entropy: the configurations with low energies E have the largest 
probabilities but the most probable energy also depends on the entropy, i.e. on 
the size of the coefficients N{E). Temperature tunes the relative importance 
of these two opposite effects. The phase transition studied in this section 
separates two regimes: 

• a high temperature phase where entropy effects are dominant: spins config- 
urations are disordered and spins do not point in any priviledged direction 
(for h = 0). The average magnetization m vanishes. 

• a low temperature phase where energy effects dominate: spins have a ten- 
dency to align with each other, resulting in ordered configurations with a 
non zero magnetization m — {(Ti)T 7^ 0. 

Let us stress that the energy and the entropy must have the same orders of 
magnitude {—0{N)) to allow for such a compromise and thus for the existence 
of a phase transition at finite strictly positive temperature. 



2.3.2 The magnetization is the order parameter. 

We start by defining the magnetization of a configuration C — {(Ji, . . . , (7jv} 
as 

1 ^ 

m{C)^-Y.cT. . (19) 



The calculation of the partition function relies on the following remark. The 
energy function (14) depends on the configuration C through its magnetization 
m(C) only. More precisely, 

E{C)^-N{^m{Cf + hm{C))+\ . (20) 
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Fig. 1. Entropy s{m) of the Ising model on the complete graph as a function of 
magnetization m. 

In the following, we shall also need the entropy at fixed magnetization S{m). 
Configurations with a fixed magnetization m have spins up and spins 
down with 



1 + m 
2 



N_^N [^^\ . (21) 



2 

The number of such configurations is therefore given by the binomial coefficient 

e^M = . (22) 

In the large iV limit, Stirling's formula gives access to the asymptotic expres- 
sion of the entropy density, s(m) = S{m)/N, at fixed magnetization. 



Figure 1 displays s(m) as a function of m. The maximum is reached at zero 
magnetization (s(0) = In 2) and the entropy vanishes on the boundaries m — 
±1. 

Let us stress that S{m) defined in (23) is the entropy at given magnetization 
and difi^ers a priori from the energy and temperature dependent entropies, 
S{E) and {S)t, defined above. However, in the thermodynamic limit, all quan- 
tities are equal provided that m and E coincide with their thermal averages. 
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{m)T and {E)t- 



The average value (m)T of the magnetization will be shown to vanish in the 
high temperature phase and to be different from zero in the low temperature 
phase. The magnetization is an order parameter: its value (zero or non-zero) 
indicates in which phase the system is. 



2.3.3 Calculation of the free-energy. 
The partition function Z reads 



exp[-/3i;(ai,...,aAr)] 

C1,...,CTJV=±1 

^ exp[-7V/3/(m)] , (24) 

m=-l 

where 

/(m) = — - — hm — T s{m) , (25) 



up to 0{1/N) terms. For the moment, we shall take /i = 0. 

In the limit of an infinite number of spins, the free-energy may be computed 
by means of the saddle-point (Laplace) method. We look for the saddle-point 
magnetization m* (that depends upon temperature T) minimizing /(m) (25). 
The latter is plotted in Figure 2 for three different temperatures. 

It can be seen graphically that the minimum of / is located at m* = when 
the temperature is larger than = 1 while there exist two opposite minima, 
m = —m*{T) < 0, m = m*{T) > below this critical temperature. The 
optimum magnetization is solution of the saddle-point equation, 

m* = tanh {(5 m*) , (26) 

while the free-energy is given by 

/(r)= hm -^lnZ = /V) ■ (27) 

N—KX> iV 

The average energy and entropy per spin (divided by N) can be computed 
from (27, 8, 11), 
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Fig. 2. Frcc-cncrgy function /(m) of the 
function of the magnetization m in zero 
temperatures, a: high temperature T = 1 
temperature T = 0.8. 

{s)T^s{m*) 



Ising model on the complete graph as a 
magnetic field h and for three different 
.2, b: critical temperature T = 1, c: low 

(28) 
(29) 
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2.3-4 Phase transition and symmetry breaking. 

In the absence of a magnetic field, the energy (14) is an even function of 
the spins: the probabihty of two opposite configurations {iTi, . . . , ctat} and 
{— (Ti, . . . , — o"Ar} are equaL As a consequence, the thermal average (a)^ of any 
spin vanishes. This result is true for any N and so, in the large limit, 

lim lim (a)T — 

N-*oo h-*0 ^ ' 

It is thus necessary to unveil the meaning of the saddle-point magnetization 
m* arising in the computation of the partition function. 

To do so, we repeat the previous calcTilation of the free-energy in presence of 
a magnetic field h > 0. The magnetization is now different from zero. At high 
temperature T > Tc, this magnetization decreases as the magnetic field h is 
lowered and vanishes when h — 0, 

hrn lim (<j)t = (T > T,) 

h—*0+ iv— »oo 

Therefore, at high temperature, the inversion of limits between (30) and (31) 
has no effect on the final result. 

The situation drastically changes at low temperature. When T < T^, the 
degeneracy between the two minima of / is lifted by the magnetic field. Due 
to the field, a contribution —h m must be added to the free-energy (25) and 
favours the minimum in m* over that in —m*. The contribution to the partition 
function (24) coming from the second minimum is exponentially smaller than 
the contribution due to the global minimum in m* by a factor exp{—2N(3hm*). 
The probability measure on spins configurations is therefore fully concentrated 
around the global minimum with positive magnetization and 

lim lim (cr)^ = m* (T < Tc) 

ft,— ►0+ N-*oo 

Prom (30) and (32), the meaning of the phase transition is now clear. Above 
the critical temperature, a small perturbation of the system (e.g. a term in 
the energy function pushing spins up), is irrelevant: as the perturbation dis- 
appears (/i — >• 0), so do its effects (m* — > 0), see (31). Conversely, below the 
critical temperature, a small perturbation is enough to trigger strong effects: 
spins point up (with a spontaneous magnetization m* > 0) even after the per- 
turbation has disappeared {h = 0), sec (32). At low temperature, two phases 
with opposite magnetizations m* and —m* coexist. Adding an infinitesimal 
field h favours and selects one of them. In more mathematical terms, the 
magnetization m is a non-analytic and discontinuous function oi h at h — 0. 
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(30) 



(31) 



(32) 



So, the phase transition here appears to be intimately related to the notion of 
symmetry breaking. In the case of the Ising model, the probability distribution 
over configurations is symmetrical, that is, left unchanged under the reversal 
of spins (T — > — (T. A high temperature, this symmetry also holds for average 
quantities: ((j)t = 0. At low temperature, the reversal symmetry is broken 
since, in presence of an infinitesimal perturbation, ((7)t = tu* ^ 0. The initial 
symmetry of the system implies only that the two possible phases of the system 
have opposite magnetizations m* and —m*. 

In the present case, the symmetry of the system was easy to identify, and to 
break! We shall see that more abstract and complex symmetries may arise 
in other problems, e.g. the random graph and K-Satisfiability. The under- 
standing of phase transitions very often will rely on the breaking of associated 
symmetries. 

Exercise 3: How does equation (26) become modified when there is a non-zero 
magnetic field? Calculate explicitely the free- energy in presence of a magnetic 
field and check the correctness of the above statements. 

2.3.5 Vicinity of the transition and critical exponents. 

To complete the present analysis, we now investigate the properties of the 
Ising model close to the critical temperature = 1 and define T — l-\-r with 
|t| <^ 1. The spontaneous magnetization reads from (26), 



Thus the magnetization grows as a power of the shifted temperature r: m* (r) ~ 
{—t)^ with (3 = 1/2. (3, not to be confused with the inverse temperature, is 
called a critical exponent since it characterizes the power law behaviour of a 
physical quantity, here the magnetization, close to criticality. Such exponents 
are universal in that they are largely independent of the "details" of the defi- 
nition of the model. We shall come back to this point in the sections devoted 
to the random graph and the K-Satisfiabihty models. 

Another exponent of interest is related to the finite size effect at the transition. 
So far, we have calculated the average values of various quantities in the infinite 
size limit N ^ oo. We have in particular shown the existence of a critical 
temperature separating a phase where the sum of the spins is on average zero 
(r > 0) from a phase where the sum of the spins acquires an 0{N) mean 
(r < 0). At the transition point (r = 0), we know that the sum of spins 
cannot be of order N; instead we have a scaling in with 5 < 1. 




(33) 
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What is the value of 57 Prom expression (24). let us expand the free-energy 
function /(m) (25) in powers of the magnetization m = 0{N^~^), 

/(m) -/(O) = ^ + i- m^ + 0(m^Tm^) , (34) 

with /(O) = —Tin 2. Above the critical temperature, r > 0, the average 
magnetization is expected to vanish. Due to the presence of the quadratic 
leading term in (34), the fluctuations of m are of the order of N~^/'^. The 
sum of the spins, N has a distribution whose width grows as N^^"^^ giving 
5 = 1/2. 

At the critical temperature, the partition function reads from (24), 

Z~2^ |dme-^"*'/^2 . (35) 

The average magnetization thus vanishes as expected and fluctuations are of 
the order of N'^'^. The sum of the spins, N thus has a distribution whose 
width grows as giving (5 = 3/4. 

The size of the critical region (in temperature) is defined as the largest value 
Tjnax of the shifted temperature r leaving unchanged the order of magnitude 

of the fluctuations of the magnetization m. A new critical exponent v that 
monitors this shift is introduced: Tmax ~ N~^/^ . Demanding that terms on the 
r.h.s. of (34) be of the same order in A^, we find u = 2. 

2.4 Randomness and the replica method. 

The above analysis of the Ising model has been useful to illustrate sonic classic 
analytical techniques and to clarify the concept of phase transitions. However, 
most optimization or decision problems encountered in computer science con- 
tain another essential ingredient we have not discussed so far, namely ran- 
domness. To avoid any confusion, let us stress that randomness in this case, 
e.g. a Boolean formula randomly drawn from a well-defined distribution, and 
called quenched disorder in physics, must be clearly distinguished from the 
probabilistic formulation of statistical mechanics related to the existence of 
thermal disorder, see (2). As already stressed, as far as combinatorial aspects 
of statistical mechanics are concerned, we can start from the definition (10) of 
the partition function and interpret it as a generating function, forgetting the 
probabilistic origin. On the contrary, quenched disorder cannot be omitted. 
We are then left with combinatorial problems defined on random structures, 
that is, with partition functions where the weights themselves are random 
variables. 



19 



2.4-1 Distribution of "quenched" disorder. 



We start with a simple case: 



Exercise 4: Consider two spins cxi and a2 with energy function 



-E'((Ti,(T2) = — J (7i (72 



(36) 



where J is a real variable called coupling. Calculate the partition function, 
the magnetization of each spin as well as the average value of the energy at 
given (quenched) J. Assume now that the coupling J is a random variable 
with measure p{J) on a finite support [J_; J+]. Write down the expressions of 
the mean over J of the magnetization and energy. What is the value of the 
average ground state energy? 

The meaning of the word "quenched" is clear from the above example. Spins 
are always distributed according to (2) but the energy function E now depends 
on randomly drawn variables e.g. the coupling J. Average quantities (over the 
probability distribution p) must be computed keeping these random variables 
fixed (or quenched) and thus are random variables themselves that will be 
averaged over J later on. To distinguish both kinds of averages we hereafter 
use an overbar to denote the average over the quenched random variables while 
brackets still indicate a thermal average using p. 

Models with quenched randomness are often very difficult to solve. One of the 
reasons is that their physical behaviour is more complex due to the presence 
of frustration. 

2.4.2 Notion of frustration. 

Frustration is best introduced through the following simple example. 
Exercise 5: Consider three spins a\, ui and with energy function 



Calculate the partition function, the magnetization of each spin as well as the 
average value of the energy. What are the ground state energy and entropy? 

Repeat the calculation and answer the same questions for 



E{ai, (72, (73) = —(71(72 — (7i(73 — (72(73 



(37) 



E{ai, (72, CTa) = — (7i(72 — (7i(73 + (72(73 



(38) 
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Note the change of the last sign on the r.h.s. of (38). 

The presence of quenched disorder with both negative and positive couphngs 
generates frustration, that is conflicting terms in the energy function. A famous 
example is the Sherrington-Kirkpatrick (SK) model, a random version of the 
Ising model on the complete graph whose energy function reads 



where the quenched couplings Jjj are independent random normal variables. 
In the SK model, contrarily to the Ising model, the product of the couplings 
Jij along the loops of the complete graph Kn may be negative. The ground 
state is no longer given by the "all spins up" configuration, nor by any simple 
prescription and must be sought for among the set of 2^ possible configura- 
tions. Finding the ground state energy for an arbitrary set of couplings Jjj is 
a hard combinatorial optimization task which in this case belongs to the class 
of NP-hard problems [15,16]. 

2. 4 -3 Thermodynamic limit and self-averaging quantities. 

Though physical quantities depend a priori on quenched couplings, some sim- 
plifications may take place in the large size limit — > oo. Many quantities 
of interest may exhibit less and less fiuctuations around their mean values 
and become self- averaging. In other words, the distributions of some random 
variables become highly concentrated as N grows. Typical examples of highly 
concentrated quantities are the (free-)energy, the entropy, the magnetization, 
... whereas the partition function is generally not self-averaging. 

Self-averaging properties are particularly relevant when analyzing a problem. 
Indeed, for these quantities, we only have to compute their average values, not 
their full probability distributions. We shall encounter numerous examples of 
concentrated random variables later in this article. 

Exercise 6: Show that the partition function of the SK model is not self- 
averaging by calculating its first two moments. 

2.4.4 Replica method. 

We consider a generic model with spins (Tj and an energy function E{C, J) 
depending on a set of random couplings J. Furthermore we assume that the 
free-energy F{J) of this model is self-averaging and would like to compute its 





1 



(39) 
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quenched averaged value F{J) or, equivalently from (12), the averaged loga- 
rithm of the partition function \nZ(J). Though well posed, this computation 
is generally a very hard task from the analytical point of view. An original but 
non rigorous method, the replica approach, was invented by Kac in the sixties 
to perform such calculations. The starting point of the replica approach is the 
following expansion 

Z(J)'* = 1 + n InZ(J) + 0(^2) , (40) 



valid for any set of couplings J and small real n. The identity (40) may be 
averaged over couplings and gives the mean free-energy from the averaged n*'* 
power of the partition function 



z{jy 



If we restrict to integer n, the n*" moment of the partition function Z can be 
rewritten as 



Z{J)^- 



c ^ 



^ E exp -- j:e{c^j) 

Ci ...,C" V a=l / 



(42) 



This last expression makes transparent the principle of the replica method. 
We have n copies, or replicas, of the initial problem. The random couplings 
disappear once the average over the quenched couplings has been carried out. 
Finally, we must compute the partition function of an abstract system of N 
vectorial spins Ui — {a}, . . . , erf) with the non random energy function 



Eeffm}) = -T In 



1 



a=l 



(43) 



This new partition function can be estimated analytically in some cases by 
means of the saddle-point method just as we did for the Ising model. The 
result may be written formally as 

Z(J)^ = exp ( - iV/(n)) , (44) 

to leading order in A^. On general grounds, there is no reason to expect the 
partition function to be highly concentrated. Thus, f{n) is a non linear func- 
tion of its integer argument n satisfying /(O) = 0. The core idea of the 
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replica approach is to continue analytically / to the set of real n and ob- 
tain F{J) = TNdf/dn evaluated at n = 0. The existence and uniqueness of 
the analytic continuation is generally ensured for finite sizes N due to the mo- 
ment theorem. In most problems indeed one succeeds in bounding |2'(J)| from 
above by a (J independent) constant C. The moments of Z grow only exponen- 
tially with n and their knowledge allows for a complete reconstruction of the 
probability distribution of Z{J). However this argument breaks down when 
the saddle-point method is employed and the upper bound C = exp{0{N)) 
becomes infinite. 

Though there is generally no rigorous scheme for the analytic continuation 
when N ^ oo, physicists have developped in the past twenty years many 
empirical rules to use the replica method and obtain precise and sometimes 
exact results for the averaged free-energy. We shall see in the case of the 
K-Satisfiability problem how the replica approach can be applied and how 
very peculiar phase transitions, related to the abstract "replica" symmetry 
breaking, are present. 

The mathematician or computer scientist reader of this brief presentation may 
feel uneasy and distrustful of the replica method because of the uncontrolled 
analytic continuation. To help him/her loose some inhibitions, he/she is asked 
to consider the following warming up exercise: 

Exercise 7: Consider Newton's binomial expression for (1 + a;)" with integer 
n and perform an analytic continuation to real n. Take the n — > limit and 
show that this leads to the series expansion in x o/ln(l -|- x). 



3 Random Graphs 

In this section, we show how the statistical mechanics concepts and techniques 
exposed in the previous section allow to reproduce some famous results of 
Erdos and Renyi on random graphs[17]. 

3. 1 Generalities 

First let us define the random graphs used. Consider the complete graph 
over N vertices. We define Gn,Nl ^ the set of graphs obtained by taking only 
Nl = 7 N/2 among the (^^^ edges of Kn in all possible different ways. A 
random graph is a randomly chosen element of Gn,Nl with the fiat measure. 
Other random graphs can be generated from the complete graph through 
a random deletion process of the edges with probability 1 — ^/N. In the large 
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N limit, both families of random graphs share common properties and we shall 
mention explicitely the precise family we use only when necessary. 

3.1.1 Connected components. 

We call "clusters" the connected components of a given graph G; the "size" of 
a cluster is the number of vertices it contains. An isolated vertex is a cluster of 
size unity. The number of connected components of G is denoted by C(G) and 
we shall indicate its normalized fraction by c{G) = ^. If c is small, the random 
graph G has few big clusters whereas for c approaching unity there arc many 
clusters of small size. Percolation theory is concerned with the study of the 
relationship between the probability p of two vertices being connected with the 
typical value of c in the N ^ oo limit. The scope of this section is to show how 
such a relationship can be exploited by the study of a statistical mechanics 
model, the so called Potts model, after a suitable analytic continuation. As 
a historical note, let us mention that analytic continuations have played an 
enormous role in physics this last century, leading often to unexpected deep 
results, impossible or very difficult to obtain by other means. 

3.1.2 Generating function for clusters. 

Let V{G) be the probability of drawing a random graph G through the deletion 
process from the complete graph K^. Since the edge deletions are statistically 
independent, this probability depends on the number of edges only, and 
factorizes as 

p(G) = p^.(«)(i-p)^-^^(G) , (45) 

where 

1-p^l-l (46) 

is the probability of edge deletion. We want to study the probability density 
p(c) of generating a random graph with c clusters, 

p{c) = Y.nG)S{c-c{G)) , (47) 

G 

where S indicates the Dirac distribution. 

We can introduce a generating function of the cluster probability by 
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Y{q) = J dcp{c)q'"^ 



1 

= /dcg^^ E V{G)5{c - c{G)) 



GCK„ 



= J2 nG)q^^''^^ E p''^''\l - P)^-''^''^^^''^ (48) 

GCKm GCKm 

with q being a formal (eventually real) parameter. 



3.1.3 Large size limit. 

In the large size limit, p(c) is expected to be highly concentrated around some 
value 0(7) equal to the typical fraction of clusters per vertex and depend- 
ing only the average degree of valency 7. Random graphs whose c{G) differs 
enough from 0(7) will be exponentially rare in N. Therefore, the quantity 

u;(c) = lim ^logp(c) (49) 

iV— »oo iV 



should vanish for c = 0(7) and be strictly negative otherwise. In the following, 
we shall compute uj{c) and thus obtain information not only on the typical 
number of clusters but also on the large deviations (rare events) . 

Defining the logarithm f{q) of the cluster generating function as 

/(?)= lim ^logy(g) , (50) 



we obtain from a saddle-point calculation on c, see (48,49), 

f(q) — max c lno-|-a;(c) 

0<c<l ^ ^ 



(51) 



In other words, / and uu are simply conjugated Legendre transforms. It turns 
out that a direct computation of / is easier and thus prefered. 



3.2 Statistical mechanics of the random graph. 



Hereafter, we proceed to compute the properties of random graphs by using 
a mapping to the so-called Potts model. Some know results can be rederived 
by the statistical mechanics approach, and additional predictions are made. 
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3.2.1 Presentation of the Potts model. 

The Potts model[18] is defined in terms of an energy function which depends 
on N spin variables CTj, one for each vertex of the complete graph Kj^, which 
take q distinct values ai — 0,1, ...,q — 1. The energy function reads 

^[W] = -E%-^.) ' (52) 

i<j 

where S{a, b) is the Kronecker delta function: S{a, b) = 1 if a = 6 and 5{a, b) — 
a a ^ b. The partition function of the Potts model is 

Zpotts^ J2 exp[/3^5((7j,(Tj)] (53) 

{ai=0,...,q-l} i<j 

where f3 is the inverse temperature and the summation runs over all spin 
configurations. 

In order to identify the mapping between the statistical mechanics features of 
the Potts model and the percolation problem in random graphs we compare 
the expansion of Zpous to the definition of the cluster generating function of 
the random graphs. 

3.2.2 Expansion of the Potts partition function. 

Following Kasteleyn and Fortuin [19], we start by rewriting ZpoUs ^ a dichro- 
matic polynomial. Upon posing 

v = e^-l , (54) 
one can easily check that (53) can be recast in the form 

Zpotts — 

When cTj and aj take the same value there appears a factor (1 + w) in the 
product (corresponding to a term in (53)); on the contrary, whenever (jj 
and (7j are different the product remains unaltered. The expansion of the above 
product reads 

+ v'^ 51 5{(Ti,(Tj)5{(Tk,(Tl) ^ ]. (56) 

i<j,k<l/{i,j)^{k,l) 
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N(N-l) 

We obtain 2 2 terms each of which composed by two factors, the first one 
given by v raised to a power equal to the number of Ss composing the second 
factor. It follows that each term corresponds to a possible subset of edges on 
Kn, each edge weighted by a factor v. There is a one-to-one correspondence 
between each term of the sum and the sub-graphs G of K^. The edge structure 
of each sub-graph is encoded in the product of the Ss. This fact allows us to 
rewrite the partition function as a sum over sub-graphs 

L(G) 

Zpous-T. E [^'^^''^ n^K^^.J] (57) 

{ai} GQKn k=0 



where L{G) is the number of edges in the sub-graph G and ik,jk ai'c the 
vertices connected by the k-th edge of the sub-graph. We may now exchange 
the order of the summations and perform the sum over the spin configurations. 
Given a sub-graph G with L links and C clusters (isolated vertices included), 
the sum over spins configurations will give zero unless all the o"s belonging to 
a cluster of G have the same value (cf. the 6 functions). In such a cluster, one 
can set the as to any of the q different values and hence the final form of the 
partition function reads 

Zpous = E ^"(^^g^^^^ • (58) 

GCKn 



3.2.3 Connection with the cluster generating function 
If we now make the following identification 

p = 1-6-^ = ^/(1 + 'y) , (59) 

we can rewrite the partition function as 

= (l-p)-^ ^ /(«)(l-p)^-^(^)g^(«) . (60) 

Computing the prefactor on the r.h.s. of (60), we have 

Zpotts^e-Y{q) , (61) 



for terms exponential in N. Y is the cluster generating function of the graph 
(48). The large N behaviour of the cluster probability u{c) is therefore related 
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to the Potts free-energy, 



/potts ( g) = - lim -^InZpott^ , (62) 

AT— ►oo p iV 



through 



- 77 - fpottsiq) = (c In g + a;(c)) . (63) 

Z 0<c<l 



We are interested in finding the value c*{q) which maximizes the r.h.s. in (63); 
since 



duj{c) 



dc 



= -\nq (64) 

c*{q) 



it foUows that uo takes its maximum value for g = 1. Differentiating eq. (63) 
with respect to g, we have 



which, in virtue of eq. (64) becomes: 



c*{q) = -q ^{q) ■ (66) 



It is now clear that the typical fraction of clusters per site, c*{q — 1), can be 
obtained, at a given connectivity 7, by computing the Potts free-energy in the 
vicinity of q = 1. Since the Potts model is originally defined for integer values 
of q only, an analytic continuation to real values of q is necessary. We now 
explain how to perform this continuation. 



3.2.4 Free- energy calculation. 

As in the case of the Ising model of section II, a careful examination of the 
energy function (52) shows that the latter depends on the spin configuration 
only through the fractions x{a; {cTj}) of variables (jj in the cr-th state {a — 
0,l,---,?-l)[20], 

1 ^ 

^{<^'d<^i}) l^Y.^{<^h<^)^ (o- = 0, 1) . (67) 

i=l 
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Of course, I]o- .x(o'; {(jj}) = 1. Note that in the Ising case {q — 2) the two 
fractions x{0) and x{l) can be parametrized by a unique parameter e.g. the 
magnetization m = {x{l) — x{Q))/2. 

Using these fractions, the energy (52) may be rewritten as 

^[W] = -^E[^('^;{^^})]' + T ■ (68) 

Note that the last term on the r.h.s. of (68) can be neglected with respect to 
the first term whose order of magnitude is O(iV^). 

The partition function (53) at inverse temperature (3 — ^/N now becomes 
Zpotts = E exp I AT E[^(^, M)?] 

K=0,l,...?-1} \ ^ a=0 J 

- E exp |iVEWf ^,-1.^- 

{a:<,=0,l/iV,...,l} \^ a=0 J Ul^oW X {a)\\ 

1 (R) 

= J UlZ\dx{a) exp{-Nf[{x{a)}]) (69) 



to the leading order in N. The subscript (R) indicates that the sum or the 
integral must be restricted to the normalized subspace J2lz}Qx(a) — 1. The 
"free-energy" density functional / appearing in (69) is 

fiM^)}] = E| - l[x{a)r + x{a)lnx{a)] . (70) 

a=0 ^ J 

In the limit of large N, the integral in (69) may be evaluated by the saddle- 
point method. The Potts free-energy (62) then reads 

fpottsiq) = rain f[{x„}] (71) 

{x{a}} 

and the problem becomes that of analyzing the minima of /. Given the initial 
formulation of the problem, each possible value of a among 0, . . . ,q — 1 plays 
the same role; indeed / is invariant under the permutation symmetry of the 
different q values. However, we should keep in mind that such a symmetry 
could be broken by the minimum (see section 2). We shall see that depending 
on the value of the connectivity 7, the permutation symmetry may or may 
not be broken, leading to a phase transition in the problem which coincides 
with the birth a giant component in the associated random graph. 
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3.2.5 Symmetric saddle-point. 

Consider first the symmetric extremum of /, 



X^2/"^((7) = -, V(7 = 0,...,?-l. 



(72) 



We have 



/p^o«.(?) = -lng- 



1_ 

2q 



(73) 



Taking the Legendre transform of this free-energy, see (63,66), we get for the 
logarithm of the cluster distribution density 



sym. 



c) 



7 



-(l-c)(l + ln7-ln[2(l-c)]) 



(74) 



|, a result that cannot be true 



a;^^™'(c) is maximal and null at c''^™(7) = 1 
for connectivities larger than two and must break down somewhere below. 
Comparison with the rigorous derivation in random graph theory indicates 
that the symmetric result is exact as long as 7 < 7c = 1 and is false above 
the percolation threshold 7c. The failure of the symmetric extremum in the 
presence of a giant component proves the onset of symmetry breaking. 

To understand the mechanism responsible for the symmetry breaking, we look 
for the local stabihty of the symmetric saddle-point (72) and compute the 
eigenvalues of the Hessian matrix 
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'^'^'^ dx{a)x{T) 



sym,{R) 



(75) 



restricted to the normalized subspace. The simple algebraic structure of M 
allows an exact computation of its g — 1 eigenvalues for a generic integer q. 
We find a non degenerate eigenvalue Aq = q{q — 7) and another eigenvalue 
Ai = g — 7 with multiplicty q — 2. The analytic continuation of the eigenvalues 
to real ? — > 1 lead to the single value A = 1 — 7 which changes sign at the 
percolation threshold 7c. Therefore, the symmetric saddle-point is not a local 
minimum of / above 7c, showing that a more complicated saddle-point has to 
be found. 



3.2.6 Symmetry broken saddle-point. 

The simplest way to break the symmetry of the problem is to look for solutions 
in which one among the q values appears more frequently than the others. 
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Therefore we look for a saddle-point of the form 
x{0)^^[l + {l-q)s] 

x{a) = ^[l-s], {a = l,...,q-l). (76) 

The symmetric case can be recovered in this enlarged subspace of solutions 
by setting s = 0. The free-energy of the Potts model is obtained by plugging 
the fractions (76) into (70). In the hmit — > 1 of interest, 

f[M] = -| + - i)fpotts(s, 7) + 0((q - If) (77) 

with 

fpottsis, 7) = |(1 - -l + s+{l-s) ln(l - s) (78) 

Minimization of fpottsi^,!) with respect to the order parameter s shows that 
for 7 < 1 the symmetric solution s = is recovered, whereas for 7 > 1 there 
exists a non vanishing optimal value s*{'y) of s that is solution of the implicit 
equation 

1 - s* = exp(-7 s*) . (79) 

The stability analysis (which we will not give here) shows that the solution 
is stable for any value of 7. The interpretation of s*{'y) is straightforward: s* 
is the fraction of vertices belonging to the giant cluster. The average fraction 
of connected components 0(7) equals —fpotts{s*{j),j), see (66), in perfect 
agreement with exact results by Erdos and Renyi. 



3.3 Discussion. 

Further results on the properties of random graphs can be extracted from the 
previous type of calculation. We shall examine two of them. 
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3.3.1 Scaling at the percolation point. 



Given the interpretation of >s*(7) for any large but finite value of A^, we may 
define the probability of existence of a cluster containing Ns sites as follows 

'^^'''^^-exp(iV/(.*,7)) ^'"^ 



In the infinite size limit this leads to the expected result 



hm P{s,N)^6{s-s*{^)) (81) 

N—^oo 



In order to describe in detail how sharp (in A^) the transition is at 7 = 1, 
we need to consider corrections to the saddle point solutions by making an 
expansion of the free-energy fpotts{s,^ = 1) in the order parameter s. At 
threshold, we have s*(l) = and fpotts{s, 1) = — + O(s^) and therefore 

P(s, N) ~ exp(-iVsV6) (82) 



In order to keep the probability finite at the critical point the only possible 
scahng for s is s = 0{N~^/^) which leads to a size of the giant component at 
criticality N x N'^^^ — N"^/^, in agreement with the Erdos-Renyi results. 



3.3.2 Large deviations. 

The knowledge of the Potts free-energy for any value of q allows one to compute 
its Legendre transform, uj{c). The computation does not show any difficulty 
and we do not reproduce the results here [21]. Phase transitions are also found 
to take place for rare events (graphs that do not dominate the cluster prob- 
ability distribution). Notice that wc consider here random graphs obtained 
by deleting edges from Kn with a fixed probability. Large deviations results 
indeed depend strongly on the process of generating graphs. 

As a typical example of what can be found using statistical mechanics, let us 
mention this simple result 

^(c = 1) = -| ' (83) 



for all connectivities 7. The above identity means that the probability that a 
random graph has N — o{N) connected components decreases as exp(— 7A^/2) 
when N gets large. This result may be easily understood. Consider for instance 
graphs with 7^" edges made of a complete graph on ^/2rfN vertices plus N — 
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\/2^N isolated vertices. The fraction of connected components in this graph 
is c = 1 — 0{1/\/N) 1. The number of such graphs is simply the number of 
choices of y/2'yN vertices among ones. Taking into account the edge deletion 
probability 1— — 7/iV, one easily recovers (83). 

3.3.3 Conclusion. 

The random graph problem is a nice starting point to test ideas and techniques 
from statistical mechanics. First, rigorous results are known and can be con- 
fronted to the outputs of the calculation. Secondly, analytical calculations are 
not too difficult and can be exploited easily. 

As its main focus, this section aimed at exemplifying the strategy used in 
more complicated, e.g. K- Satisfiability, problems. The procedure of analytic 
continuation, which is at the root of the replica approach, appears nicely in 
the computation of the Potts free-energy and is shown to give exact results 
(though in a non rigorous way) . The power of the approach is impressive. Many 
quantities can be computed and rather subtle effects such as large deviations 
are easily obtained in a unique framework. 

At the same time, the main weakness of the statistical mechanics approach 
is also visible. Most interesting effects are obtained when an underlying sym- 
metry is broken. But the structure of the broken saddle-point subspace is far 
from obvious, in contrast to the Ising case of the previous section. There is at 
first sight some kind of arbitrariness in the search of a saddle-point of the form 
of (76). In the absence of a well-established and rigorous procedure, the sym- 
metry breaking schemes to be used must satisfy at least basic self-consistency 
checks (plausibility of results, local stability, ...). In addition, theoretical physi- 
cists have developed various schemes that arc known to be efficient for various 
classes of problems but (fail in other cases). A kind of standard lore, of precious 
help to solve new problems, exists and is still waiting for firm mathematical 
foundations. 



4 Random K-satisfiability problem 

In what follows we shall describe the main steps of the replica approach to 
the statistical mechanics analysis of the Satisfiability problem. The interested 
reader may find additional details concerning the calculations in several pub- 
hshed papers [22-28] and in the references therein. 

The satisfaction of constrained Boolean formulae is a key issue in complexity 
theory. Many computational problems are known to be NP-complete [15,29] 
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through a polynomial mapping onto the K-Satisfiability (SAT) problem, which 
in turn was the first problem shown to be NP-complete by Cook in 1971 [30]. 

Recently [31], there has been much interest in a random version of the K-SAT 
problem defined as follows. Consider N Boolean variables Xi, i = 1,. . . ,N. 
Call a clause C the logical OR of K randomly chosen variables, each of them 
being negated or left unchanged with equal probabilities. Then repeat this 
process by drawing independently M random clauses Cg, £ = 1, . . . , M. The 
logical AND of all these clauses is a "formula", referred to as F. It is said 
to be satisfiable if there exists a logical assignment of the xs evaluating F to 
true, and unsatisfiable otherwise. 

Numerical experiments have concentrated on the study of the probability 
Pn{ol, K) that a randomly chosen F having M — aN clauses be satisfiable. 
For large sizes, a remarkable behaviour arises: P/v seems to reach unity for 
a < ac{K) and vanishes for a > a^K) when N ^ oo [32,31]. Such an abrupt 
threshold behaviour, separating a SAT phase from an UNSAT one, has indeed 
been rigorously confirmed for 2-SAT, which is in P, with ac{2) = 1 [33,34]. 
For larger K > 3, K-SAT is NP-complete and much less is known. The ex- 
istence of a sharp transition has not been rigorously proved but estimates of 
the thresholds have been found : Q;c(3) ~ 4.3 [35]. Moreover, some rigorous 
lower and upper bounds to ac(3) (if it exists), = 3.14 and au.b. = 4.51 
respectively have been estabhshed (see the review articles dedicated to upper 
and lower bounds contained in this TCS special issue) . 

The interest in random K-SAT arises partly from the following fact: it has 
been observed numerically that hard random instances are generated when 
the problems are critically constrained, i.e., close to the SAT/UNSAT phase 
boundary [32,31]. The study of such hard instances represents a theoretical 
challenge towards an understanding of complexity and the analysis of exact 
algorithms. Moreover, hard random instances are also a test-bed for the op- 
timization of heuristic (incomplete) search procedures, which are widely used 
in practice. 

Statistical mechanics provides new intuition on the nature of the solutions 
of random K-SAT (or MAX-K-SAT) through the introduction of an order 
parameter which describes the geometrical structure of the space of solutions. 
In addition, it gives also a global picture of the dynamical operation of search 
procedures and the computational complexity of K-SAT solving. 

4-1 K-SAT energy and the partition function. 

To apply the statistical physics approach exemplified on the random graph 
problem, one has to identify the energy function corresponding to the K-SAT 
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problem. 



The logical values of an Xi can be represented by a binary variable Si, called 
a spin, through the one-to-one mapping Si = —1 (respectively +1) if Xj is 
false (resp. true). The random clauses can then be encoded into a,n M x N 
matrix Cu in the following way : Ca = — 1 (respectively +1) if the clause 
Ce includes 3^ (resp. Xi), CpA = otherwise. It can be checked easily that 
J2iLi CiiSi equals the number of wrong literals in clause i. Consider now the 
cost-function £"[0, S] defined as the number of clauses that are not satisfied 
by the logical assignment corresponding to configuration S. 

M / N \ 

E[C,S]^J2HJ2CeiSi + K] , (84) 
e=i \i=i J 

where 6{j) = 1 if j = 0, zero otherwise, denotes the Kronecker function. The 
minimum (or ground state -GS) E[C] of i?[C, S], is the lowest number of 
violated clauses that can be achieved by the best possible logical assignment 
[23]. E[C] is a random variable that becomes highly concentrated around its 
average value Eqs = E[C\ in the large size limit [36]. The latter is accessible 
through the knowledge of the averaged logarithm of the generating function 

Z[C]-^exp(-£[C,S]/r) (85) 
s 

since 

Egs^-T log Z[C]+0{T') , (86) 

when the auxiliary parameter T is sent to zero. Being the minimal number of 
violated clauses, Eqs equals zero in the sat region and is strictly positive in 
the unsat phase. The knowledge of Eqs as a function of a therefore determines 
the threshold ratio ac{K). 



4-2 The average over the disorder. 

The calculation of the average value of the logarithm of the partition function 
in (86) is an awkward one. To circumvent this difficulty, we compute the n*^ 
moment of Z for integer-valued n and perform an analytic continuation to real 
n to exploit the identity Z[C]" = 1 -|- nlogZ[C] -|- O(n^). The n^^ moment of 
Z is obtained by replicating n times the sum over the spin configurations S 
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and averaging over the clause distribution [23] 



Z[CY= exp -^£;[C,S«]/r , (87) 

Si,S2,...,S" V a=l / 



which in turn may be viewed as a generating function in the variable e ^^'^ . 

In order to compute the expectation values that appear in eq.(87), one notices 
that each individual term 



4{sn] = exp (-^E£;[c,s«]) 



(88) 



factorises over the sets of different clauses due to the absence of any correlation 
in their probability distribution. It follows 

^[{S"}] = {CK[{S'^}]f , (89) 
where each factor is defined by 



^[{8-^}] = exp 



1 n / AT ^ 
a=l \i=l J 



(90) 



with the bar denoting the uniform average over the set of 2^ vectors of N 
components Cj = 0, ±1 and of squared norm equal to K. 

Resorting to the identity, 

^ ~ , (91) 



\i=i I iiaM 



one may carry out the average over in disorder in eq.(90) to obtain 

1 1 ^ ( ^ n K \ 

a[{S«}] = ^ E 1^ E exp --En^K + ^^] (92) 

up to neghgible 0{1/N) contributions. 

The averaged term in the r.h.s. of (87) depends on the n x N spin values 
only through the 2" occupation fractions x{a) labeled by the vectors a with 
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n binary components; x{a) equals the number (divided by A^) of labels i such 
that S'f = cr", Va = 1, . . . , n. It follows that C/^[{S"}] = CkM where 



(93) 



Ci,...,C/f =±1 Si,...,Sk 



1 



I a=\e=l 



To leading order in (e.g., by resorting to a saddle point integration), the final 
expression of the n^^ moment of Z can be written as Z[CY — exp(— A/" fopt/T) 
where fopt is the optimum (in fact the minimum for integer n) over all possible 
xs of the functional [23] 

m^e[^] + Wx{a)\ogx{a) , (94) 



with 



n K 



e[x] = ctln 



E x{a,)...x{^K) exp (--En ^['^e ' 1] 



■ (95) 



a=l 1 



Note the similarities between equations (94) and (70). While in the random 
graph or Potts model case a took on q values, the K-SAT model requires the 
introduction of 2" vectors a. In both cases, an analytic continuation of the 
free-energy to non integer values of g or n has to be performed. Finally, note 
that the optimum of / fulfills x{a) = x{—a) due to the uniform distribution 
of the disorder C. 



4-3 Order parameter and replica-symmetric saddle-point equations. 



The optimization conditions over /[x] provide 2" coupled equations for the 
xs. Notice that / is a symmetric functional, invariant under any permutation 
of the replicas a, as is evident from equation (87). An extremum may thus be 
sought in the so-called replica symmetric (RS) subspace of dimension n + 1 
where x{a) is left unchanged under the action of the symmetric group. In the 
limit of interest, T — > 0, and within the RS subspace, the occupation fractions 
may be conveniently expressed as the moments of a probability density P{m) 
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over the range — 1 < m < 1 [23], 



/" / 1 + ma"'\ 
dm P{m) Jl j . (96) 



P{m) is not uniquely defined by (96) for integer values of n but acquires 
some precise meaning in the n ^ limit. It is the probability density of the 
expectation values of the spin variables over the set of ground states. Consider 
a formula F and all the spin configurations S*--'-', j = 1, ... ,(5 realizing the 
minimum E[C\ of the cost-function E[C, S], that is the solutions of the MAX- 
SAT problem defined by F. Then define the average magnetizations of the 
spins 

mi-^j:S? , (97) 



over the set of optimal configurations. Call i?(C, m) the histogram of the mjS 
and H{m) its quenched average, i.e., the average of H{C,m) over the random 
choices of the formulae F. H{m) is a probability density over the interval 
— 1 < m < 1 giving information on the distribution of the variables induced 
by the constraint of satisfying all the clauses. In the absence of clauses, all 
assignments are sohitions and all magnetizations vanish: H{m) = S{m) and 
variables are not constrained. Oppositely, variables that always take the same 
value in all solutions, if any, have magnetizations equal to +1 (or —1): such 
variables are totally constrained by the clauses. 

As discussed in ref . [23] , if the RS solution is the global optimum of (94) then 
H{m) equals the above mentioned P{m) in the limit of large sizes oo. 
Therefore, the order parameter arising in the replica calculation reflects the 
"microscopic" structure of the solutions of the K-SAT problem. 

At this stage of the analysis it is possible to perform the analytic continuation 
n — s> since all the functionals have been expressed in term of the generic num- 
ber of replicas n. Such a process leads to a self-consistent functional equation 
for the order parameter P{m), which reads 



P{m)^ 
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with 

A^K-i) ^ A^K-i){{m,} , /3) = 1 + (e-^ - 1) n' 

^=1 ^ ^ ^ 



(99) 



and /3 = 1/T. The corresponding rephca symmetric free-energy density reads 



1 K 



- (3 fopt{a,T) ^\n2 + ail - K) f Y[ dmeP{mt) In A^k) 

_i e=i 

aK }^~^ 
+ — / Yl drriiP (mi) In A(^K-i) 

1 ) 

-- J dmP{m) ln(l - m^) . (100) 

-1 

It can be checked that equation (98) is recovered when optimizing the free- 
energy functional (100) over aU (even) probabihty densities P(m) on the in- 
terval [-1,1]. 

4-4 The simple case of K=l. 



Before entering in the analysis of the saddle-point equations for general K, 
it is worth considering the simple K = 1 case which can be solved either by 
a direct combinatorial method or within the statistical mechanics approach. 
Though random 1-SAT does not present any critical behaviour (for finite a), 
its study allows an intuitive understanding of the meaning and correctness of 
the statistical mechanics approach. 

For K = 1, a. sample of M clauses can be defined completely by giving directly 
the numbers ti and /j of clauses imposing that a certain Boolean variable Si 
must be true or false respectively. The partition function corresponding to a 
given sample reads 

TV 

z[{tJ}]^Y[i^-^'' + ^~^^') ' (101) 

i=l 



and the average over the disorder gives 
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= ln2-^+ f: e-°/,(a) ln(|cosh(^^jj , (102) 

where Ii denotes the modified Bessel function. The zero temperature hmit 
gives the ground state energy density 

Oi 

easia) = -[1 - e-"7o(a) - e'^hia)] (103) 



and the ground state entropy density 

SG5(a) = e-%(a) In 2 . (104) 



For any a > 0, the ground-state energy density is positive and therefore the 
overall Boolean formula is false with probability one. Also, the entropy density 
is finite, i.e., the number of minima of the energy for any a is exponentially 
large. Such a result can be understood by noticing that there exist a fraction 
of unconstrained variables e~"/o(Q;) which are subject to equal but opposite 
constraints ti — fi. 

The above results are recovered in the statistical mechanics framework, thereby 
showing that the RS Ansatz is exact for all /3 and a when K — 1. 

The solution of the saddle-point equation (98) can be found for any tempera- 
ture T leading to the expression 

P(m) = £ e^"/^(a) 5 - tanh {^^^ ■ (105) 

In the limit of interest /5 — > oo, this formula reads 

P[m) = e"'Io{a)6{m) + ^(1 - e-"/o(a)) {S{m - 1) + 6{m + 1)) . (106) 



As shown in figure 3, the fraction of unconstrained variables is simply asso- 
ciated with the unfrozen spins and thus gives the weight of the 5-function 
at m = 0. On the contrary, the non-zero value of the fraction of violated 
clauses, proportional to the ground-state energy density, is due to the pres- 
ence of completely frozen (over constrained) spins of magnetizations m = ±1. 
Such a feature remains valid for any K. 
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Fig. 3. Energy density (bold line) and entropy density (thin line) versus a in a 
random 1-SAT formula, in the limit N — > oo. 

4-5 Sat phase: structure of the space of solutions. 

We start by considering the sat phase. An interesting quantity to look at is the 
typical number of solutions of the random K-SAT problem; this quantity can 
be obtained from the ground state entropy density sgs{c() given by eq.(lOO) 
in the /3 — > oo limit. 

In the absence of any clauses, all assignments are solutions: sgs{c( = 0) = 
In 2. We have computed the Taylor expansion of sgs{c() in the vicinity of 
q; = 0, up to the seventh order in a. Results are shown in Figure 4. It is 
found that sasidc = 1) = -38 and SGs{a = 4.2) = .1 for 2-SAT and 3-SAT 
respectively: just below threshold, solutions are exponentially numerous. This 
result is confirmed by rigorous work [37] . 

More involved calculations, including replica symmetry breaking (RSB) effects 
[28] , have shown that the value of the entropy is insensitive to RSB in the sat 
phase. Therefore the RS calculation provides a quite precise estimate of the 
entropy (believed to be exact at low a ratios, see Talagrand's paper in this 
volume for a discussion). 

Recent analytical calculations for 3-SAT [28] (also confirmed by numerical in- 
vestigations) indicate that the RS theory breaks down at a definite ratio ajisB 
below ac, where the solutions start to be organized into distinct clusters. The 
meaning of this statement is as follows. Think of the space of spins configu- 
rations as the A'"-dimensional hypercube. Optimal assignments are a subset of 
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Fig. 4. RS estimate for the entropy density in random 2-SAT and 3-SAT below 
their thresholds. RSB corrections due to clustering are absent in 2-SAT and very 
small (within few a percent) in 3-SAT. The dots represent the results of exact 
enumerations in small systems {N ranging from 20 to 30, see ref. [22]) 

the set of 2^ vertices on the hypercube. Replica symmetry amounts to assum- 
ing that any pair of vertices are a.s. separated by the same Hamming distance 
d, defined as the fraction of distinct spins in the corresponding configurations. 
In other words, solutions are gathered in a single cluster, of diameter dN. 
RSB variational calculations [28] show that this simplifying assumption is not 
generally true in the whole sat phase and that another scenario may take place 
close to threshold: 

• Below aRSB the space of solutions is replica symmetric. There exist one 
cluster of solutions characterized by a single probability distribution of local 
magnetizations. The Hamming distance o? is a decreasing function of a, 
starting at d{0) = 1/2. 

• At OiRSB — 4.0, the space of solutions breaks into a large number (poly- 
nomial in A^) of different clusters. Each cluster contains an exponential 
number of solutions. The typical Hamming distance do between solutions 
belonging to different clusters is close to 0.3 and remains nearly constant 
(it is slightly decreasing) up to ac, indicating that the centers of these clus- 
ters do not move on the hypercube when more and more clauses are added. 
Within each cluster, solutions tend to become more and more similar, with 
a rapidly decreasing intra-cluster Hamming distance di. 

Figure 5 provides a qualitative representation of the clustering process. The 
fact that the Hamming distance can take two values at most is a direct conse- 
quence of the RSB Ansatz. In reality, the distance distribution could be more 
complicated. The key point is that statistical mechanics calculations strongly 
support the idea that the space of solutions has a highly organized structure, 
even in the sat phase. 
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Fig. 5. Variational RSB estimate for the clustering of solutions below Qc for 3 — SAT. 
d is the typical Hamming distance between solutions. The splitting of the curves 
at a ~ 4 corresponds to clustering. There appear two characteristic distances, one 
within each cluster and one between solutions belonging to different clusters. 

Recently, the exact solution of the balanced version of random K-SAT [38] 
has provided a concrete example in which the appearance of clustering before 
the sat/unsat transition can be studied both analytical and numerically. Note 
that this phenomenon is strongly reminiscent of what happens in some formal 
multi-layer neural networks models [5]. 

4-6 Unsat phase: the backbone and the order of the phase transition. 

In the unsat phase, it is expected that 0{N) variables become totally con- 
strained, i.e. take on the same value in all the ground states. Such a hypothesis, 
which of course needs to be verified a posteriori, corresponds to a structural 
change in the probability distribution P{m) which develops Dirac peaks at 
m = ±1. 

In the hmit of interest (T — > 0), to describe the accumulation of the magne- 
tization on the borders of its domain (m G [—1; 1]), we introduce the rescaled 
variable z, implicitly defined by the relation m = tanh(2;/T), see equation 
(106). Calling R{z) the probability density of the zs, the saddle-point equa- 
tions read 



R{z) 




— cos[uz) exp 




+ aKx 



(107) 



— oo 




The corresponding ground state energy density reads, see (100), 
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eosia) = a{l- K) / dziR{ze)mm{l, Zi, . . . , zk) 
^=1 

aK 7^-1 7 
+ — y n dztR{zi)jmvL{l, ^1, . . . , zx-i) - j dzR{z)z . (108) 



It is easy to see that the saddle-point equation (107) is in fact a self-consistent 
identity for R{z) in the range z e [0, 1] only. Outside this interval, equation 
(107) is merely a definition of the functional order parameter R. 

As discussed in detail in ref. [23], equations (107) admit an infinite sequence 
of more and more structured exact solutions of the form 



R(z)^ J2 reSiz-- 

l=-oo \ ^/ 



(109) 



having exactly q peaks in the interval [0, 1[, whose centers are ze = jj, i = 
0, . . . ,q — 1. The corresponding energy density reads, from (109) and (108), 




(110) 



1=1 



Though there might be continuous solutions to (107), it is hoped that the 
energy of ground state can be arbitrarily well approximated by the above 
large q solutions. 

The location of the sat/unsat threshold can be obtained for any K by looking 
at the value of a beyond which the ground state energy becomes positive. For 
2 — SAT the exact result ac{2) = 1 is recovered whereas for X > 2 the RS 
energy becomes positive at a value of a (e.g.. ac(3) ~ 4.6 as shown in figure 6) 
which is sightly higher than the value estimated by numerical simulations. 



4-6.1 A hint at replica symmetry breaking. 

The RS theory provides an upper bound for the thresholds for any K > 2, 
whereas the exact values can be obtained only by adopting a more general 
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Fig. 6. RS estimate for the ground state energy density, i.e., the number of violated 
clauses divided by N in random 3-SAT. The prediction is given as a function of a, 
for g S> 1 and in the limit N ^ oo. See ref. [23] for details. 

functional form for the solution of the saddle-point equations which explicitly 
breaks the symmetry between replicas (see ref. [27] for a precise discussion). 
Such an issue is indeed a relevant, and largely open, problem in the statistical 
physics of random systems [39-46] . 

The general structure of the functional order parameter which describes so- 
lutions that break the permutational symmetry among rephcas consists of a 
distribution of probability densities: each Boolean variable fluctuates from one 
cluster of solutions to another, leading to a site dependent probability density 
of local Boolean magnetizations. The distribution over all different variables 
then provides a probability distribution of probability distributions. The above 
scheme can in principle be iterated, leading to more and more refined levels 
of clustering of solutions. Such a scenario would correspond to the so-called 
continuous RSB scheme [1]. However the first step solution could suffice to 
capture the exact solution of random K-SAT, as happens in other similar 
random systems [1]. 



4-6.2 Abrupt vs. smooth phase transition. 

Of particular interest are the fully constrained variables - the so called back- 
bone component -, that is the XiS such that = ±1. Within the RS Ansatz, 
the fraction of fully constrained variables 7(0;, K) can be directly computed 
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Fig. 7. Numerical estimates of the value of the backbone order parameter in 2-SAT 
and 3-SAT. The curves [25] are obtained by complete enumerations in small systems 
(up to iV = 500 variables for 2-SAT and AT = 30 for 3-SAT) averaged over many 
samples. 

from the saddle-point equations. Clearly, 7(0;, K) vanishes in the SAT region 
otherwise the addition of eA'" new clauses to F would lead to a contradiction 
with a finite probability for any e > 0. Two kinds of scenarii have been found 
when entering the unsat phase. For 2-SAT, 7(0;, 2) smoothly increases above 
the threshold Q!c(2) = 1. For 3-SAT (and more generally K > 3), 7(0;, 3) ex- 
hibits a discontinuous jump to a finite value 7c slightly above the threshold. A 
finite fraction of variables become suddenly over constrained when crossing the 
threshold! Numerical results on the growth of the backbone order parameter 
are given in figure 7. 



4.6.3 The random 2+p-SAT model. 

The sat/unsat transition is accompanied by a smooth (respectively abrupt) 
change in the backbone component and therefore in the structure of the solu- 
tions of the 2-SAT (resp. 3-SAT) problem. A better way to understand how 
such a change takes place is to consider a mixed model, which continuously 
interpolates between 2-SAT and 3-SAT. The so-called 2 -|- p-SAT model [25] 
includes a fraction p (resp. 1—p) of clauses of length two (resp. three). 2-SAT 
is recovered for p = and 3-SAT when p = 1. The RS theory predicts that, 
at the sat/unsat transition, the appearance of the backbone component be- 
comes abrupt when p > po — 0.4 (see figure 8). On the contrary, when p < po, 
the transition is smooth as in the 2-SAT case. Such a scenario is consistent 
with both rigorous results (see the paper by Achlioptas et al. in this volume) 
based on the probabilistic analysis of simple algorithm and with variational 
calculations [28] which include RSB effects. 
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Fig. 8. adp) versus p in random 2+p-SAT. Up to po — A ac{p) = —p), in 
agreement with rigorous results. For p > po the transition becomes discontinuous 

in the backbone order parameter and the RS theory provides an upper bound for 
adp) which is within a few percent of the results of numerical simulations (dots) 
[25,26]. 

An additional argument in favor of the above picture is given by the analysis 
of the finite-size effects on P;v(a, K) and the emergence of some universality 
for p < Pq. (The definition of P/v was given when wc began discussing the 
properties of K-SAT.) A detailed account of these findings may be found 
in [25,26]. For p < po the size of the critical window where the transition 
takes place is observed to remain constant and close to the value expected for 
2-SAT. The critical behaviour is the same as for the percolation transition in 
random graphs (see also ref. [47]). For p > po the size of the window shrinks 
following some non-universal exponents toward its statistical lower bound [48] 
but numerical data do not allow for any precise estimate. The balanced version 
of 2-|-p-SAT can be studied exactly and both the phase diagram and the critical 
exponents turn out to behave very similarly to the ones of 2 -|- p-SAT [49] . 

As we shall conclude in the next section, the knowledge of the phase diagram 
of the 2-|-p-SAT model is very precious to understand the computational com- 
plexity of 3- SAT solving. 
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^.7 Computational complexity and dynamics. 

Numerical experiments have shown that the typical solving time of search 
algorithms displays an easy-hard-easy pattern as a function of a with a peak of 
complexity close to the threshold. Since computational complexity is strongly 
affected by the presence of a phase transition, it is appropriate to ask whether 
the nature of this phase transition plays an important role too. The peak 
in the search cost seems indeed to scale polynomially with N (even using 
Davis- Putnam-like procedures) for the 2-SAT problem, where the transition 
is continuous, and exponentially with N in the 3-SAT case, for which the birth 
of the backbone is known to be discontinuous. 

Precise numerical simulations [25,26] on the computational complexity of solv- 
ing critical 2-|-p-SAT instances support the view that the crossover between 
polynomial and exponential scalings takes place at po, the very value of p sep- 
arating continuous from discontinuous transitions. Though investigated 2+p- 
SAT instances are all critical and the problem itself is NP-complete for any 
p > 0, it is only when the phase transition is abrupt that hardness shows 
up (including the fastest known randomized search algorithms such as walk- 
sat [50]). 

To understand why search algorithms require polynomial or exponential com- 
putational efforts, statistical studies of the solutions cannot be sufficient. A 
full dynamical study of how search procedures operate has to be carried out. 
Such studies had already been initiated by mathematicians in the easy region, 
where search tree are particularly simple and almost no backtracking occurs. 
Franco and Chao [51] have in particular analyzed the operation of DP algo- 
rithms with different kinds of heuristics and have shown that at small values 
of a the typical complexity is linear in N. 

Recently, the whole range of values of a, including the hard phase, has been in- 
vestigated, using dynamical statistical mechanics tools [52] . During the search 
process, the search tree built by DP grows with time and this growth process 
can be analyzed quantitatively. The key idea is that, under the action of DP, 
3-SAT instances are turned into mixed 2+p-SAT instances (some clauses are 
simplified into clauses of length two, other are satisfied and eliminated). The 
parameters p and a of the instance under consideration dynamically evolve 
under the action of DP. Their evolution can be traced back as a trajectory in 
the phase diagram of the 2-|-p-SAT model of figure 8. Depending on whether 
trajectories cross or not the sat/unsat boundary, easy or hard resolutions take 
place, and the location of crossings can be used to quantitatively predict the 
scaling of the resolution times [52]. 
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5 The traveling salesman problem and the cavity method 

In Section 3, we derived partition functions using statistical pliysics repre- 
sentations based on analytic continuations. Furthermore, we used the saddle 
point method on these partition functions and that allowed us to reproduce 
a number of exact results. Then we moved on in Section 4 and applied these 
methods to models with quenched disorder. However, because of the greater 
complexity of such models, we resorted to an additional tool of statistical 
physics: the replica method. Though this kind of approach is non-rigourous, 
it is believed that it provides new exact results for a number of different prob- 
lems, in particular in optimization. 

The replica method is not the only technical tool that physicists have devel- 
oped in the past years. Another approach, called the cavity method, will be 
exposed in the present Section. The cavity approach gives, at the end of the 
computation, the sames results as the replica approach. Yet the assumptions it 
relies upon turn out to be much more intuitive and its formalism is closer to a 
probabilistic theory formulation. Because of this, it can be used to prove some 
of the results derived from statistical mechanics; see [53,54] for recent progress 
in this direction. In the rest of this section, we show how this cavity method 
can be used to "solve" a case of the Travehng Salesman Problem (TSP). 

The TSP is probably the world's most studied optimization problem. As usu- 
ally formulated for a weighted graph, one considers all Hamiltonian cycles 
or "tours" (closed circuits visiting each vertex once and only once) and asks 
for the shortest one. The total length is given by the sum of the weights or 
"lengths" of the edges making up the tour. Since the Hamiltonian cycle prob- 
lem is NP-complete, certainly the TSP is very difficult. However, in most cases 
considered, the graph is complete (there is an edge for each pair of vertices), 
so the difficulty lies in determining the shortest tour. Without further restric- 
tions on the nature of the graph, the TSP is iVP-hard [15]. One speaks of the 
asymmetric TSP when the edges on the graph are oriented, and of the sym- 
metric TSP for the usual (unoriented) case. Both types are frequently used 
models in scheduling and routing problems, though the industrial applications 
tend to move away from the simple formulations considered in academia. The 
symmetric TSPs are further divided into "metric" and non-metric according 
to whether or not the triangle inequality for the edge lengths is satisfied. The 
so-called Euclidean TSP is probably the best known TSP and it is metric; the 
vertices are points (cities, or sites) in the plane, and the length of the edge 
connecting cities i and j is given by the Euclidean distance between i and j. 
Even within this restricted class of weighted graphs, the problem of finding 
the optimum tour remains NP-hard [15]. 

The TSP has been at the forefront of many past and recent developments 
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in complexity. For instance, pretty much all general purpose algorithmic ap- 
proaches have been first presented and tested for the TSP. This tradition be- 
gins back in 1959 when Beardwood et al. [55] published tour lengths obtained 
from hand-drawn solutions! Later, the idea of optimization by local search 
was introduced in the context of the TSP by Lin [56] , and simulated anneal- 
ing [57,58] was first tested on TSPs also. The list continues with branch and 
bound [59], until today's state of the art algorithms based on cutting planes 
(branch and cut) [60], allowing one to solve problems with several thousand 
cities [61]. Many physicists have worked on these kinds of algorithmic ques- 
tions from a practical point of view; in most cases their algorithms incorporate 
concepts such as temperature, mean field, and renormalization, that are stan- 
dard in statistical physics, leading to some of the most effective methods of 
heuristic resolution [62] . It might be argued that these approaches can also be 
used to improve the heuristic decision rules at the heart of exact methods (for 
instance in branching strategies) , but more work has to be done to determine 
whether this is indeed the case. 

The widespread academic use of the TSP also extends to other issues in com- 
plexity. For instance, there has been much recent progress in approximability 
of the TSP [63]. However statistical physics has nothing to say about worst 
case behavior; instead it is relevant for describing the typical behavior arising 
in a statistical framework and tends to focus on self-averaging properties. Thus 
we are lead to consider TSPs where the edge lengths between vertices are cho- 
sen randomly according to a given probability distribution; the corresponding 
problem is called the stochastic TSP. 



5.1 The stochastic TSP. 



Statistical physicists as well as probabilists arc not interested per-se in any 
particular instance of the TSP, rather they seek "generic" properties. This 
might be the typical computational complexity or the typical length of TSPs 
with N cities. It is then necessary to consider the stochastic TSP where each 
instance (the specification of the weighted graph) is taken at random from an 
ensemble of instances; this defines our "quenched disorder" . Although one may 
be interested in many different ensembles, only a few have been the subject 
of thorough investigation. Perhaps the most studied stochastic TSP is the 
Euclidean one where the cities are randomly distributed in a given region of the 
plane [55] . This is a "random point" ensemble. Another ensemble that has been 
much considered consists in having the edge lengths all be independent random 
variables, corresponding to a "random distance" ensemble. (This terminology 
is misleading: the problem is not metric as the triangle inequality is generally 
not satisfied.) Random distance ensembles have been considered for both the 
symmetric [64] and the asymmetric [65] TSP. 
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For any of these ensembles, one can ask for the behavior of the optimum 
tour length, or consider properties of the tour itself. Most work by proba- 
bilists has focused on the first aspect (see [14] for a review), starting with the 
seminal work of Beardwood, Halton, and Hammersley [55] (hereafter referred 
to as BHH). Those authors considered the Euchdean ensemble where points 
are randomly (and independently) distributed in a bounded region Q of d- 
dimensional Euclidean space according to the probability density p(X). Given 
a not too singular p, BHH proved that the optimum tour length, Le, becomes 
peaked at large A^, and that with probability one as N ^ oo 

J^a - m I p'-'^'{X)dX (111) 



Here is a constant, independent of p, depending only on the dimension of 
space. Some comments are in order. The first is that the relative fluctuations 
of the tour length about its mean tend to zero as ^ oo, allowing one to 
meaningfully define a "typical" or generic tour length at large N. This fun- 
damental property was initially proven using sub-additivity properties of the 
tour length, but from a more modern perspective, it follows from considering 
the passage from to + 1 cities, corresponding to a martingale process 
(see [66]). The second point is that the A^ dependence of this typical length 
is such that the rescaled length Le/N^^^^'^ converges in probability at large 
A^. In the language of statistical physics, this quantity is just the ground state 
energy density of the system where one increases the volume linearly with 
A^ so that the mean density of points is A^-independent. In general such an 
energy density is expected to be self-averaging, i.e., have a well defined large 
A^ limit, independent of the sequence of randomly generated samples (with 
probability one) as in Eq.(lll). In some problems, the self-averaging property 
can be derived, while it will simply be assumed to hold when using the cavity 
approach. 

Another comment is that given Eq.(lll), the essence of the problem is the 
same for any p(X); it is thus common practice to formulate the Euclidean 
TSP using A^ points laid down independently in a unit square (or hypercube 
if (i > 2), the distribution being uniform. 

There has been much work [14] on obtaining bounds and various estimates 
of the constants /3(d), but no exact results are known for d > 1. However, 
Rhee [67] has proved that 

1 as d-oo (112) 
vet v2e7r 



Prom the point of view of a statistical physics analysis, the difficulty in com- 
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puting f3{d) arises from the correlations among the point to point distances. 
Indeed, in the Euchdean ensemble, there are dN random variables associated 
with the random positions of the points, and N{N — l)/2 distances; these dis- 
tances are thus highly redundant (and a fortiori correlated) . When these dis- 
tances are instead taken to be random and independent, the "cavity" method 
of statistical physics allows one to perform the calculation of the corresponding 
(3. Because of this, we will focus on that quenched disorder ensemble. 

In the "independent edge-lengths ensemble" (as opposed to the independent 
points ensemble), it is the distances or edge lengths between points that are 
independent random variables. Let dij be the "distance" between points i and 
j (the problem is not metric, but we nevertheless follow the standard nomen- 
clature and refer to dij as a distance). In the most studied case, dij is taken 
from a uniform distribution in [0, 1]. From a physicist's perspective, it is nat- 
ural to stay "close" to the Euchdean random point ensemble [64] by taking 
the distribution of d^j to be that of two points randomly distributed in the 
unit square (hypercube when d > 2). The independent points and indepen- 
dent edge-lengths ensembles then have the same distribution for individual 
distances, and in the short distance and large N limit they also have the same 
distribution for pairs of distances. The main difference between the ensembles 
thus arises when considering three or more distances; in the Euclidean case, 
these have correlations as shown for instance by the triangle inequality. 

The minimum tour length in these random edge-lengths models is expected 
to be self-averaging; the methods of Rhee and Talagrand [66] show that the 
distribution of TSP tour lengths becomes peaked at large N in this case, but 
currently there is no proof of the existence of a limit as in the Euclidean case. 
Nevertheless, this seems to be just a technical difficulty, and it is expected that 
the rescaled tour length indeed has a limit at large N; we thus define P{d) in 
analogy to the expression in Eq.(lll) with the understanding that the f3s are 
different in the independent points and independent edge-lengths ensembles. 

5.2 A statistical physics representation. 

Following the notation of Section 2, we introduce the generating or partition 
function 

Z{T)^j:eM-^) (113) 

where a is a permutation of the vertices and determines uniquely a tour. In 
effect we have identified configurations with tours, that is with permutations; 
furthermore, the energy of a configuration is simply the length of its tour. 



52 



This construction amounts to introducing a probability e~^^'^^l'^ jZ for each 
tour. When T = oo, all tours are equally probable, while when T ^ only 
the shortest tour(s) survive. As before, T is the temperature, and the averages 
(.)t using this probability distribution are the thermal averages. Prom them 
one can extract most quantities of interest. For instance 



gives the mean tour length at temperature T. We then have for the TSP tour 
length: L„j„ = hm^^o < >r- 

The generating function Z requires performing a sum over all permutations 
and is a difficult object to treat. To circumvent this difficulty, a different 
representation is used. We first introduce what is called a "spin" S, having 
now m-components, 5"^, a — 1, ...,m. These components are real and satisfy 
the constraint Y^J\S°'Y' — Such a spin can be identified with a point on a 
sphere in m-dimensional Euclidean space. Note that when m = 1, we recover 
the kinds of spins considered in the previous sections. Now for our statistical 
physics representation of the TSP, a spin Sj is associated to each vertex of 
the graph, i = 1, N . Define Rij — e~*^/^ and introduce a new generating 
function 

G{T,m,uj) = / dSidS2---dSN exp(a; y^^Rjj Sj ■ Sj) (US) 



In this expression, • is the usual scalar product, and dS is associated with the 

uniform measure on the sphere in dimension m. We have normalized it so that 
J dS = 1; then / dSS'^S^ = 6a,i3- The claim is now that the initial generating 
function Z is equivalent to using an analytic continuation of G in m: 

G — 1 .r-^ , Licr). 
lim = y exp ^ 116 



Comparing to the Potts model of Section 3, we see that m is analogous to the 
Potts parameter q: the partition function is defined for integer values of the 
parameter, and then has to be analytically continued to real values. 

The derivation of equation (116) is based on showing the equality of both sides 
when performing a power series in 1/T. First expand the exponential in the 
integral: 



^ — j dSidS2---dSN 



l+^^i?,,(S,-S,) + |- 

i<j 



;ii7) 
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Now integrate term by term; each resulting contribution can be associated with 
a subgraph (but where edges can appear muhiplc times) whose weight is given 
in terms of its edges and its cycles. (Note that each vertex must be covered 
an even number of times because the integrand is even under Sj — > — Sj.) 
Each edge Eij appearing in the subgraph contributes a multiplicative factor 
Rij to its total weight. A further factor comes from the loops (cycles) of the 
subgraph. It is not difficult to sec that each such loop leads to a factor m 
in the total weight because of the integration over the m-dimensional spins. 
Thus as m — > only subgraphs having a single loop survive in G and then 
vertices cannot belong to more than two edges. Finally, when a; ^ oo, the 
loops with the most vertices dominate, leading to tours. Thus if we first take 
m — » and then cj — oo, the expansion of G — 1 reduces to a sum over all 
the tours of the graph. Furthermore, the weight of each tour is proportional 
to the product of the Rij belonging to the tour, so that one recovers the total 
weight mu;^ e:K.p{—L/T) where L is the tour length. In conclusion, Eq. (116) is 
justified to all orders in l/T, and thus for any finite N it holds as an identity. 

Whether one uses Z or G — 1 does not matter as they differ only by an 
irrelevant multiplicative factor (we assume m and l/cu infinitesimal). From 
G — 1, one can compute the optimum tour and not just the optimum tour 

length; indeed, at finite temperature, the probability that a tour contains the 
edge Eij is given by the mean occupation of that edge. Defining riij = 1 if the 
edge is used by the tour and riij = otherwise, the probability of occupation 
is 



where from now on {.)t means thermal average using either Z or G — 1; 
the one that is used should be clear from the observable considered. Now if 
we take in Eq. 118 the limit T — > 0, we find those edges that are occupied 
and thus the optimal tour (assuming it is unique). Note also that Eq. (118) 
has a simple justification: (Sj • Sj)r has a numerator whose expansion gives 
muj^~^ / Rij times the weighted sum over all tours containing the edge ij, while 
the denominator is muj^ times the weighted sum over all tours. The identity 
Eq. (118) then follows immediately. 

5.3 The cavity equations. 

The partition function G — 1 gives the "statistical physics" of the TSP for 
any given graph. Using this formalizm to determine analytically the optimum 
tour in a general case seems an impossible task. Nevertheless, G is a good 
starting point for following the passage from to iV + 1 vertices as in a 
martingale process, and the derivation of a recursion in N is the heart of 




(118) 
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Fig. 9. (A^ + l)th spin and its ordered neighbors. 



the cavity method. The term cavity comes from the fact that the system at 
A'" + 1 is compared to the one at N by removing the {N + l)th spin, thereby 
creating a cavity. In figure 9, we have represented in counter-clockwise order 
the nearest, next-nearest, etc... neighbors of site + 1 which is at the center 
of the cavity. Because the total number of spins will be sometimes N and 
sometimes + 1, we indicate the number via a subscript on G. Thus for 
instance Gat — 1 is to be used when considering quantities for the system with 
A^ spins. Now for every quantity associated with the system having A^ -|- 1 
spins, if we integrate explicitly over spin A^ + 1, we are left with quantities 
defined in the system having only A^ spins. Consider for instance Gn+i — 1 
itself. When expanding the exponentials depending on Sjv+i, we obtain: (i) 
terms linear in S^+i that integrate to zero; (ii) terms quadratic in Sat+i that 
upon integration give products Si ■ Sj; (iii) higher powers in Sjv+i that do not 
contribute as m — > 0. A simple calculation leads to the identity 



G n+1 1 _ 2 V- p D /c Q \' _ -^^+1 

- ^ l<j<k<N 



Gr 



where (.)^ is a "cavity average" , to be taken in the system having only the first 
A^ spins, spin A^ + 1 being absent. Note that and ^at+i are the partition 
functions of Eq. 113 when there are N and A^-|- 1 vertices; also, it is easy to see 
that one need not restrict the sum to j ^ k because the term j — k vanishes 
as m — > 0. 

Straightforward calculations in this same spirit lead to relations between ther- 
mal expectation values using A'^-l-l spins and those using A" spins. For instance 

N 

(Sjv+i)t {Gn+1 - 1) = ^ujRj,N+i{Sj)'T {Gn - 1) (120) 
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Similarly, one has for the two-spin average: 



(Sat+i • Sj)T (Gat+i ~ 1) — ^^^^j,N+i (Si • Sj)rp (G^r — 1) (121) 



More generally, the numerator in any observable depending on spin + 1 has 
a simple expression in terms of the numerators of obscrvables in the absence 
of that spin. Furthermore, one can use Eq. 119 to eliminate all reference to 
Gn and Gat+i in these relations. The conclusion is that if we know how to 
compute the properties of systems with N spins, we can then deduce those of 
systems with N + 1 spins; the cavity method is thus a recursion on N for all 
the properties of such a system. 

5.4 The factorization approximation. 

Unfortunately, these recursion equations cannot be solved, but let us approxi- 
mate them by neglecting certain correlations. Clearly, Sjv+i is strongly corre- 
lated with its nearest neighbors because the corresponding Rs are important. 
More generally, two spins whose joining edge length is short (are near neigh- 
bors) will be strongly correlated because short tours will often occupy that 
edge. Thus we must and will take into account the correlations between Sat+i 
and its near neighbors. However, we will neglect here the correlations among 
these neighbors themselves, so that in the absence of Sat+i, their joint proba- 
bility distribution factorizes, so that in particular 



This property imphes that replica symmetry is not broken, and this is indeed 
believed to be the case for the TSP. Factorization makes the cavity approach 
particularly tractable, as we shall soon sec. (In systems where replica symme- 
try is broken, it is necessary to find ways to parametrize these correlations; 
this is quite complex and not well resolved, even within the statistical physics 
approach.) 

A second point concerns the meaning of (SAf+i)r- Gat+i is rotationally sym- 
metric; there is no preferred direction, so the thermal average of any spin 
vanishes. Note however that we have seen a similar situation before in the 
context of the Ising model (c.f. Section 2). Here as before, the interactions 
tend to align the spins. Thus, when the temperature is low enough, we ex- 
pect to have a spontaneous magnetization when — ^ 00. To make this more 
explicit, we can introduce a small magnetic field, i.e., an interaction term of 
the type — h • Si for each spin; we then take the limit N ^ 00 and only after 
take h — > 0. This magnetic field breaks the rotational symmetry, and so the 




(122) 
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system has a preferred direction, even after the field has been removed. By 
convention, we shall take this direction to be along the first axis. 



Given these two remarks, we can use the exact equations (120) and (121) 
to obtain the cavity equations assuming factorization. Denoting by the 
component along the first axis of S, one has 



^'^^ ^ J2l<j<k<N Rj,N+lRk,N+l {Sj)^ {SDt 



(123) 



Similarly, one has for the two-spin average (see Eq. 118): 

\ni,N+l)T - Ki,N+l\^i )t ^ o D /Ql\' /QlV ^^"^^f 

2^1<j<k<N ■'^j,N+l-rik,N+l\>^j /T Wfe/T 



These are the standard cavity recurrence equations, first derived by Mezard 
and Parisi [68]. We also note that in this factorization approximation, one has 
(Siv+i • Si)r = {Sn+i)t{SI)t 



5.5 The N ^ oo and T ^ limits. 



The last step of the cavity method is to assume that the recurrence equations, 
when considered in the disorder ensemble, give rise to a stationary stochastic 
process when — > oo. Consider for instance the individual magnetizations 
{Si)T', they are random variables because the dij themselves are. If we want 
them to have a limiting distribution at large A^, (i.e., in physical terms, to 
have a thermodynamic limit), we have to rescale the dij by N^/'^ or equiva- 
lently set T = TN-'^/'^ with f fixed. (Note that in the case of the Euchdean 
TSP, the rescaling of lengths can be interpreted as taking the limit N ^ oo 
while keeping the density of points fixed, that is by increasing the size of the 
volume fl linearly with N.) The important point is that the "environment" 
seen by the spins must have limiting statistical properties as N ^ oo, and this 
translates to having A'"-independent statistics for the distances of a spin to its 
near neighbors. Then it is assumed that the probability density of the {S})t 
converges to a limiting distribution Poo when N ^ oo. The cavity method 
is thus a kind of bootstrap approach where Poo is assumed to exist and it is 
determined by its stationarity property under the cavity recurrence. 

That such a stationary limit exists can be motivated by the large N behavior of 
the tour length in the stochastic TSP. In fact, it is expected that all quantities 
associated with any fixed number of edges will converge in the thermodynamic 
limit, so it should be possible to look at 2, 3, or k edge constructs. At present 
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though, because of the technical difficulty, only the single edge computations 
have been carried out. Fortunately, that is enough for getting the value of 
/9, and allows one to obtain the so called link-length distribution, i.e., the 
distribution of the edge lengths appearing in the optimal tours. 

Equation (123) with the condition of stationarity of the stochastic process 
leads to a complicated implicit equation for Pqo- Fortunately, in the zero tem- 
perature limit (which is where we recover the usual stochastic TSP), the re- 
currence relations are much simpler. Following Krauth and Mezard [69], one 
defines (pi for any vertex i {i — 1, ...,N) via: 

(S/)' = (125) 



One also defines 4>n+i analogously using Now re-order the indices of 

the first vertices so that 

N'/''di,N+i - 01 < N^/''d2,N+i - 02 < ... < N^^''dN,N+i - (t>N (126) 



Then the zero-temperature hmit of Eq. 123 leads to 

0iv+i = c^2,iv+iA^'/'' - 02 (127) 



while Eq. 124 shows that the optimum tour uses the edges connecting -|- 1 
to vertices 1 and 2, i.e., rii^jv+i = n2,N+i — 1, ah others are equal to zero. 

If we have a stationary stochastic process, Eq. (127) leads to a self-consistent 
equation for the probability density P of the 0s. We also see that the random 
variables Xi = N^/'^di^N+i — <Pi {i = play a fundamental role. By 

hypothesis, they are uncorrelated: the rfj jy+i because we are dealing with the 
independent edge-lengths ensemble, and the 0j because we have explicitly 
neglected the correlations between the spins in the absence of Sat+i. Denote 
by n(x) tlic probability density of these random variables; n(x) is uniquely 
determined in terms of P, assuming the distribution of rfj^Ar+i given. From here 
on, take for simplicity these edge lengths to be uniformily distributed in [0, 1]. 
(This corresponds to the 1-dimensional case d = 1; we refer the reader to [69] 
for more general distributions.) The relation between 11 and P then becomes 

1 ^ 

Ii{x) = -j P{l-x)dl (128) 





Now a self-consistent equation for P is obtained by using the fact that 0Ar+i 
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is the second smallest of the N different xs: 



-oo +00 



P(0) = A^(Ar-l)n(0)( J U{u)du){ J U{u)du)^-^ (129) 



In the large N limit, this integral non-linear implicit equation simplifies to 

P(0) = ^(0) e-«W where ^(0) = J uP{u - <P)du (130) 



Plugging the expression for P into this last equation leads to 

+00 

G{(l)) = / [1 + G{t)\ e-^W dt (131) 



This cannot be solved analytically, but can easily be treated numerically, and 
one can obtain machine precision results for G and thus P without too much 
effort. 

Assuming G and P have been computed, one can find in a similar way the 
distribution of di^N+i and d2,N+i- For instance, the distribution of the rescaled 
distance Ndi^N+i — h is given by 

Pi(/i)= I P{k-x)e-''^^^dx (132) 



This, along with the analogous distribution for (i2,Af+i, gives the distribution 
of edge lengths in the optimum tour, and thus also the mean tour length, i.e., 
when d = 1, the value of /?. Krauth and Mezard [69] showed that this constant 
could be written in terms of G alone, 

+00 

/3 = - y G{t) [1 + G{t)] e-^W dt (133) 
—00 



and they found f3 = 2.041. . . (Note that when d = 1, as suggested by Eq. (Ill), 
the tour length becomes independent of A^. This can be understood qualita- 
tively by observing that each vertex can connect to one of its near neighbors 
that is at a distance 0{1/N).) 



59 



5.6 "Exact" solution in the independent edge-lengths ensemble. 

As described, the cavity method involves an uncontrollable approximation 
associated with ignoring certain correlations. It is natural to ask whether those 
correlations might in fact be absent in certain ensembles. A simple case is when 
the graph considered is a Caylcy tree with the root (corresponding to vertex 
A''+ 1) removed. Then the different neighbors of S^+i are uncoupled and have 
no correlations at all. Unfortunately, this type of graph will not do for the 
TSP as it has no Hamiltonian cycles, but it can do for other problems close 
to the TSP such as the minimum matching problem. 

So let us consider instead the structure of independent edge-lengths graphs. 
Locally their properties ressemble those of Cayley trees, so that with some 
luck the previous reasoning can hold for these types of graphs as A?" — > oo. 
Although the correlations that were neglected in the cavity calculation will 
always be present at finite in the independent edge-lengths model, they 
have every reason to go to zero as A^ — > oo. The justification is that the close 
neighbors of vertex A^-|- 1 are "infinitely" far from one-another when N ^ oo. 
In the language of tours (rather than spins), this means that the probability 
for the tour to have an edge connecting two of the finite order neighbors of 
vertex A^ + 1 should go to zero at large A^. Clearly this is not the case in 
the Euclidean stochastic TSP because of the triangle inequality: the neighbor 
of a neighbor is itself a neighbor. But in the independent edge-length model, 
the neighbors represented in figure 9 are "far away" from one-another with a 
probability tending towards 1 as N ^ oo. This kind of random "geometry" is 
then expected to lead to uncorrelated spins among the finite order neighbors 
of Sjv+i and so the cavity calculation may become exact as N —>■ oo. 

Although it is not clear yet that the correlations go away as A" — > oo in the 
independent edge-lengths ensemble, the reasoning above is supported by ex- 
tensive simulational results. In these kinds of tests, one generates weighted 
graphs in the ensemble of interest, determines the optimum tour for different 
sizes A^, and then estimates the statistical properties in the large A^ limit. 
All such simulational studies to date have confirmed the validity of the cavity 
method. Both the assumptions of no replica symmetry breaking [70] and the 
predictions for f3 and P{di^N^i) have been validated [69,71,70] in that way. 
Although these tests have limited precision in the context of the TSP, more 
stringent tests [72,73] have been performed on matching problems. For in- 
stance, using the cavity and rephca methods, Mezard and Parisi predicted [68] 
that the length of a minimum matching of A^ points would have the large A^ 
limit 7r^/12 when the dij are uniformily distributed in [0,1]. The numerical 
simulations confirm this value at the level of 0.05%. 

The consensus is thus that the cavity method gives exact results at large A^ 
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for all independent edge-lengths disorder ensembles. But for the physicist, 
this is not the only interest of the cavity method: even as an approximation, 
it is useful for understanding the effects of quenched disorder. For instance, 
one can ask [69] how bad is the factorization approximation when applied 
to the Euchdean TSP in d — 2. For that, we compare Krauth and Mezard's 
cavity prediction j3{2) = 0.7251... to the best estimate from numerical simu- 
lations [74,71] 0.7120 ± 0.0004. We see that in fact the prediction is quantita- 
tively good, and it turns out that this approximation becomes even better as 
the dimension of space d is increased. 



5.7 Remarks on the cavity approach and replica symmetry breaking. 

In some respects, the cavity method is complementary to the replica method, 
but both become unwieldy when replica symmetry is broken. In the case of the 
TSP, it turns out that only the cavity method has allowed a complete solution, 
but that model has no replica symmetry breaking. When replica symmetry 
breaking does arise, the situation is far more complex, and to date only models 
defined on graphs with infinite connectivity have been solved exactly (though 
not rigorously). Nevertheless, recent progress [75] in using the cavity method 
may soon lead to "exact" solutions of other models such as K-SAT in spite of 
the presence of replica symmetry breaking. 



6 Related topics and conclusion. 

6.1 Other optimization problems investigated in physics. 

This article has focused on presenting statistical physics tools in the context of 
a few well-known problems. But many other random combinatorial problems 
have been considered by physicists, often using nearly identical techniques 
to the ones we have presented. For the reader interested in having a more 
complete view of such work, we give here a partial list of problems and pointers 
to the litterature. 

Graph bipartitioning. 

Given a graph G, partition its N vertices into two sets of equal size. The cost 
of the partition is the number of edges connecting vertices in different sets. 
The graph bipartitioning (or graph bisection) problem consists in finding the 
minimum cost partition. 



61 



This problem is readily reformulated in the physics language of spins: to each 
vertex i attach a spin Si and set it to +1 if the vertex is assigned to the first 
set and —1 if it is assigned to the second set. Calling Gij the adjacency matrix 
of the graph G, the number of edges "crossing" the partition can be identified 
with an energy: 

E^WG^J{1-S^S,) . (134) 



Since the partition is assumed balanced, the global magnetization M — Y,iSi 
is constrained to be zero. In physics studies, researchers enforce this constraint 
in a soft way by adding to the energy E, where A is a positive param- 

eter. As a result, spins interact through effective couplings Jjj = {Gij — A)/2 
that can be positive or negative. The corresponding energy function is then 
seen to be a spin glass Hamiltonian, similar to the Sherrington-Kirpatrick 
model exposed in Section 2.4.2. The first authors to notice this identification 
were Fu and Anderson [76, 77]. They then applied the Parisi solution of the 
Sherrington-Kirpatrick model to give the large N value of the minimum cost 
partition when G has connectivities growing linearly with N. These results 
generalize to weighted graphs straightforwardly. 



Weighted minimum bipartite matching. 

Let / and J be two sets containing points each. We assume given an A^ x A^ 
matrix of "distances" dij defined for each pair i e I,j E J. For any complete 
matching (a one-to-one map or a pairing between / and J, more commonly 
known as a bipartite matching), its cost is defined as the sum of the distances 
between paired points. In the minimum weighted bipartite matching problem 
one is to find the complete matching of lowest cost. Naturally, one can consider 
a stochastic version where the entries of the distance matrix are independent 
random variables, drawn from a probability distribution p{d). This problem is 
close in its technical aspects to the stochastic TSP, and like the non-bipartite 
case it has been "solved" both via the replica and the cavity methods [68,78]. 
In the special case where p{d) is the uniform distribution in [0, 1], Mezard and 
Parisi hav computed the large N hmit of the typical cost to be 7r^/6. In fact, 
in a real tour de force, they also obtained the form of the 1/N correction to 
this limit. More recently, Parisi considered the special case p{d) = cxp(— rf) 
and conjectured [79] that for any A^ the mean minimum cost is given by 
J2k=i,...,N 1/^^- current evidence, both numerical and analytical for small 
A^ values [80], indicates that this formula at finite A^ could be exact. 
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Number partitioning. 

This problem can be motivated by the need to divide an estate between two 
inheritors in a fair way. It is usually formulated as follows. Let {xi, X2, x^v} 
be real numbers in [0, 1] and consider a partition of the Xi into two (un- 
balanced) sets. The "unfairness" of a partition is the sum of the x's in the first 
set minus the sum of the x's in the second. The number partitioning problem 
consists in determining the partition that minimizes the absolute value of 
the unfairness. When the independent random numbers, it is possible 

to derive some statistical properties of the minimum. We refer the reader to 
Mertens' detailed review in the present issue [4] of his recent work. 

Vertex cover 

Very recently, A. Hartmann and M. Weigt studied the minimum size of vertex 
coverings of random graphs. Phase transitions take place, accompanied by 
drastic changes of the computational complexity of finding optimal vertex 
coverings using branch-and-bound algorithms. See the article in the present 
volume [3]. 

Neural Networks. 

To a large extent, learning and generalization properties of formal neural net- 
works are optimization problems. These properties have been the subject of 
intense studies by statistical physicists in the last fifteen years. A quite com- 
plete review of these works and results are exposed in the article by A. Engel 
in this volume [5]. 

6.2 Further statistical properties. 

Statistical physics concepts and techniques arc powerful tools to investigate 
the properties of ground states, that is the solutions of combinatorial opti- 
mization problems. So far, we have concentrated on the large size (large 'W ) 
limit of these problems, but one can also consider finite N . In addition, it may 
be of interest to know the properties of the near-optimum solutions. 

Finite-size corrections and scaling. 

Mean-field models can be solved through saddle-point calculations in the in- 
finite size limit only. Clearly, optimization problems usually deal with a finite 
number of variables. It is therefore crucial to achieve a quantitative under- 
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standing of the finite size corrections to be expected, e.g., on the ground state 
energy. 

Far from phase transitions, corrections to the saddle-point value can usually 
be computed in a systematic way using perturbation theory. An example of 
such a calculation to determine finite-size corrections has been mentioned pre- 
viously (see the bipartite matching problem discussed in Section 6.1). For any 
quantity or "observable" associated with the optimum solution of a problem, 
one can ask how its disorder- average depends on the system size. Similarly, 
fluctuations, which disappear in the infinite volume limit, generally matter for 
finite sizes. Both effects are well-known in the statistical physics of systems 
without disordered interactions and have been the subject of many theoretical 
studies[81,82]. 

Close to transition points, the handling of finite-size corrections is much more 
involved. Few results are avalaible for disordered systems [83] . Generally speak- 
ing, the transition region is characterized by a window, the width of which 
scales as some negative power of the system size, shrinking to zero in the in- 
finite size hmit. We have already discussed the critical scaling properties of 
some systems in Sections 2.3.5 and 3.3.1. No similar theoretical study of crit- 
ical exponents has been performed so far for complex optimization problems, 
e.g. K-SAT; only numerical data or bounds on the exponents are currently 
available. 

Finite- dimensional energy landscapes and robustness. 

Reahstic physical systems and certain optimization problems such as the Trav- 
eling Salesman Problem live in a finite-dimensional world. Thus, although we 
considered in Section 3 a percolation model on a random graph, the physics 
of the problem is usually modeled using a lattice in two or three-dimensional 
space, edges joining vertices only if they are close in Euclidean space. Mod- 
els based on random graphs are considered to describe physical systems only 
when the dimension goes to oo. 

Finite-dimensionality may have dramatic consequences on some properties of 
the models; for instance it is known that the critical exponents depend on 
the dimension of the embedding space. More crucially, in low dimensions, the 
correct order parameter could be quite different from what it is in infinite 
dimension. This issue is particularly acute in the physics community in the 
case of spin glasses: so far, no consensus has been reached concerning the 
correct description of these systems in dimension 3. Two main theories exist: 

• Parisi 's hierarchical picture. This sophisticated theory comes from extend- 
ing mean-field theory to finite dimensional spaces. It states that low lying 
configurations, i.e. having an energy slightly larger than the ground state. 
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may be very far away, in the configuration space, from the ground state. 
These excited configurations are organized in a complex hierarchical fashion, 
in fact an ultrametric structure. 
• The droplet picture. Conversely, the droplet picture is based on simple scal- 
ing arguments inspired from ferromagnetic systems and claims that low- 
lying configTirations stand close to the ground state. Higher and higher en- 
ergy excitations will be obtained when fiipping more and more spins from 
the ground state. 

A detailed presentation of the theories can be found in [2]. Knowing which 
picture is actually correct could have deep consequences for dynamical issues 
(see the next paragraph), and also for the robustness of the ground state. For 
instance, it can be important from a practical point of view to know how much 
a perturbation or modification of the energy function affects the ground state 
properties. Consider in particular the problem of image reconstruction. Can 
a small change in the data modify macroscopically the reconstructed image? 
Within the droplet picture, the answer would be generally no, while Parisi's 
theory would support the view that disordered systems often have non-robust 
ground states. 



6.3 Perspectives. 

The study of the statistical properties of disordered systems has witnessed 
major advances in the last two decades, but the most recent trend has been 
towards trans-disciplinary applications. Although it is difficult to guess what 
new directions will emerge, there has been a clear and growing interest in 
using statistical physics tools for investigating problems at the heart of com- 
puter science. In this review, we illustrated this for decision and optimization 
problems, but many other problems should follow. Looking at the most re- 
cent work, we see emerging efforts to extend these methods to understand the 
statistical properties of the corresponding algorithms, be-they exact or heuris- 
tic. Let us first sketch these issues and then mention some further possible 
directions. 



Typical case computational complexity. 

The notion of typical case computational complexity is appealing, and statis- 
tical physics tools may help one understand how that kind of classification of 
decision problems may be reached. But clearly the methods needed to do so go 
much beyond what we have presented: partition functions and analogous tools 
describe the solutions of a problem, not how long it can take to find them. Nev- 
ertheless, as we mentioned in Section 4.7 in the context of the Davis-Putnam 
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tree search, physical arguments can shed new hght on how algorithms such as 
branch and bound behave near a phase transition. Thus these methods may 
tell us what is the typical computational complexity of an instance chosen at 
random in an ensemble, given a particular tree search algorithm. Extending 
this classification to obtain an algorithm-independent definition of typical case 
computational complexity may follow, but so far it remains largely open. 



Long time (stationary) limit of stochastic search algorithms. 

Consider heuristic algorithms that are based on stochastic search. Examples 
are simulated annealing, G-Walk, or deterministic limits of these such as lo- 
cal search. These kinds of algorithms define random walks, i.e., stochastic 
dynamics on a discrete space of solutions (boolean assignments for K-SAT, 
tours for the TSP, etc..) and these dynamics are "local": just a few vari- 
ables are changed at each time step. Assume for simplicity that the initial 
position of the walk is chosen at random. At long times, the search settles in 
a steady state where the distribution of energies becomes stationary, that is 
time-independent. (The energy at any given time is a random variable, de- 
pending on the starting point of the search and also on all the steps of the 
walk up to that time. The energy thus has a distribution when considering 
all initial positions and all possible walks.) An obvious question is whether 
this distribution becomes peaked in the large size limit. Indeed, in most cases, 
one can show that the energy of a random solution is self-averaging; note that 
this corresponds simply to the self- averaging property of the thermodynamic 
energy at infinite temperature. In fact, for the problems we have focused upon, 
the energy is expected to be self-averaging at all temperatures. By a not so 
bold extrapolation, one may conjecture that any local stochastic search algo- 
rithm leads to self- averaging energies in the long time limit. (Naturally, we 
also have to assume that the algorithms do not have too much memory; using 
a simulated annealing with temperatures changing periodically in time will 
not do!) There is numerical evidence [84] in favor of this conjecture, and it 
may be possible to use statistical physics methods to prove it in some limit- 
ing cases. One can also ask what is the limiting shape of the distribution of 
energies. This is a difficult question, but it may be easier in this context than 
when considering the optimum. 



Dynamics of stochastic search algorithms. 

Is the self-averaging behavior just mentioned restricted to long times? Since 
the initial energies are those of random solutions and are thus self-averaging, 
it is quite natural to generalize the conjecture to all times: "the energy at any 
given time of a local stochastic search algorithm is self- averaging" . 
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Quite a bit of intuition about this issue can be obtained by considering what 
happens by analogy with a physical system relaxing towards equilibrium. The 
main characteristic of the dynamics in a physical system is the property called 
detailed balance; this condition puts very stringent restrictions on the tran- 
sition probabilities. But within this specific framework, there has been much 
progress recently in describing the time dependence of the dynamical process. 
In particular, the conjecture introduced above is confirmed in the context of 
mean field p-spin glass models. The exact solution of these models has led to 
new results on entropy production while the phenomenon of "ageing" has been 
explained theoretically. Clearly an important goal is to extend these results 
to arbitrary stochastic dynamics without the hypothesis of detailed balance. 
But perhaps one of the most remarquable results coming from these studies 
(see for instance the contribution of Bouchaud et al. in[2]) is a relation be- 
tween the relaxation during these dynamics and the effects of a perturbation: 
the prediction, called the generalized fluctuation-dissipation relation, seems 
numerically to be quite general and it would be of major interest to test it in 
the context of more general stochastic dynamics. 



Further directions. 



We will be brief and just give a list of what we consider to be promising topics. 
First, just as the notion of computational complexity has to be generalized to 
a typical case description, the analogous generalization of approximability is 
of interest. In its stochastic or typical extension, an algorithm provides an e 
typical case approximation to a problem if with probability tending towards 1 
in the large size limit, its output is within e of the actual solution. Naturally 
results that hold in the worse case also hold stochastically, but one may expect 
new properties to hold in this generalized framework. Second, there has been 
an upsurge of interest in physics for combinatoric problems, using techniques 
from field theory and quantum gravity. The problems range from coloring 
graphs to enumerating meanders. Although the initial problem has no disor- 
der, the approaches use identities relating systems with disorder to systems 
without disorder that are as yet still in the conjectoral stage. Third, is there 
a relation between replica symmetry breaking and typical case complexity? 
Forth, will the statistical physics approaches in artificial neural networks and 
learning lead to new developments in artificial intelligence? Fifth, an active 
subject of study in decision science concerns "belief propagation" algorithms 
which are extensions of the cavity method. Can these extensions lead to bet- 
ter understanding of physical systems, and inversely, will the use of physics 
concepts such as temperature, mean field, scaling, and universality continue 
to lead to improved algorithms in practice? 
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A Answers to Exercises 



A.l Exercise 1: System with two spins and statistical independence. 



The partition function (3) at temperature T = reads 



(Tl,(T2 = ±l 

= Yl exp(/3aiC72)) 

CTl,(T2=±l 

= 4 cosh /3 . (A.l) 

The magnetization m{T) and the average vahie of the energy {E)t can be 
computed from the knowledge of Z, see (8). One obtains 

m(T) = ((7i)t = , (A.2) 



and 

(E)t = - tanh(/5) . (A.3) 

The magnetization vanishes since any configuration {ai,a2} has the same 
statistical weight as its opposite, {— ui, — 172}. 

These calculations can be repeated for the second choice of the energy function, 
E{ai, (T2) — — (Ji — (J2, with the following results: 



Z{T) = {2 cosh I3f 
m{T) — tanh [3 

(£;)r = -2 tanh(/3) . (A.4) 

We see that the partition function is the square or the single spin partition 
function. The magnetization and the energy (once divided by the number of 
spins) are equal to the ones of a single spin, see expression (4). 

This coincidence is a direct consequence of the additivity property of the 
energy. More precisely, whenever the energy of a system can be written as the 

sum of two (or more) energies of disjoint subsystems, le., involving disjoint 
configuration variables, the partition function is simply the product of the 
subsystems partition functions. Such disjoint subsystems do not interact and 
are statistically independent. 
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A. 2 Exercise 2: Zero temperature energy and entropy. 

Let us suppose that the configurations C form a discrete set. Let us call Eq 
the smallest energy and A^o the number of configurations having this energy. 
Similarly we call Ei the immediately higher value of energy, with degeneracy 
Ni. This process can be repeated for more and more excited energies. At the 
end, configurations are sorted according to their energies with Eq < Ei < 
E2<.... 

From the definition (3) of the partition function, we write 

j>0 

= e-^^° (A^o + A^ie-^^i+A^2e-^^2 + ...) , (A.5) 

where Gj — E.j — Eq is the gap between the j'*'* excited energy and the minimal 
one. By construction, all gaps Gj are strictly positive {j > 1). Thus, in the 
small temperature (large (3) limit, we obtain 

Z(T) = Aroe-^^°(l + o(e-'^^^)) , (A.6) 

from which we deduce the free-energy, 

F{T)^-TlnZ{T)^Eo-TlnNo + o(^e-'^^A . (A.7) 



From the definition of entropy (11), it appears that the zero temperature 
entropy {S)t=o is simply the logarithm of the number of absolute minima of 
the energy function E{G). 



A. 3 Exercise 3: Spins on the complete graph in the presence of a field. 

The calculations are immediate from (25). The only difference is that, in the 
presence of a small but non zero field /i, the two minima of the free-energy 
shown on figure 2 are now at two different heights. One of the two minima 
(with the opposite sign of h) is exponentially suppressed with respect to the 
other. 



70 



A. 4 Exercise 4: Quenched average. 



Using the results of Exercise 2, we write the partition function, magnetization 
and the average value of the energy, 

J) = 4 cosh(/3J) 
m(T, J) = 

{E)t{J) = - J tanh(/3J) . (A.8) 

All these statistical quantities depend on the quenched couphng J. 

Wc now average over the coupling J, with distribution p(J) on the support 
[ J_ ; J+] . We obtain for the quenched average magnetization and energy. 



m(r)=0 

1^ = - J dJ p{J) J tanh{(3J) . (A.9) 



j_ 



In the zero temperature limit, the spins align (respectively anti-align) onto 
each other if the coupling J is positive (resp. negative). The resulting ground 
state energy equals \J\. Averaging over the quenched coupling, we obtain 

J+ 

'{Ej^o = - J dJ piJ)\J\ . (A.IO) 



A. 5 Exercise 5: Frustrated triangle of spins. 

Both energies are even functions of the spins; the magnetization is thus always 
equal to zero. 

We first consider the energy function 

E{ai, (72, CTs) = -aia2 - aias - a2a3 . (A.ll) 

The partition function and the average value of the energy read respectively, 

Z(T) = 2e^^ + 6e-^ 

/E)t^ ■ (A.12) 
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In the zero temperature limit, the ground state energy and entropy are given 

by 

{E)t=o = — 3 

{S)T=o^ln2 . (A. 13) 

There are indeed two configurations with minimal energy; all their spins are 
aligned in the same direction. 

We now consider the energy function 

E{ai, (72, (73) = -(71(72 - (7i(73 + (72(73 . (A. 14) 

The partition function and the average value of the energy now read respec- 
tively. 



Z(T) = 6 + 2 e"^^ 

In the zero temperature limit, the ground state energy and entropy are given 
by 



{E)t=o = —1 

(^)y^o = ln6 . (A. 16) 

As a result of frustration, the ground state energy is higher than in the previous 
case, as well as the number of ground states. Note also that the gap between 
the lowest and second lowest energy levels has become smaller. 



A. 6 Exercise 6: Partition function of the Sherrington- Kirkpatrick model. 

The partition function of the Sherrington-Kirkpatrick (SK) model reads 

Z(J)= ^ exp(^^J,,(7,(7,) , (A.17) 

where the quenched couplings J = {Jij,l <i<j < N} are randomly drawn 
from the Gaussian distribution 

nj)= n ^^^p(-^4) ■ (A-18) 
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To compute the average value of the partition function, we first average the 
couphngs out and only then calculate the sum over the spins 



Z{J) = J dJ P{J) Z{J) 



= E exp — E('^i^i)' 
= 2^exp(^(7V-l)] 



(A.19) 



We now calculate the second moment of the partition function by rewriting 
the squared sum as the product of two independent sums, see Exercise 1, 



z{jy = j dj p{j) z{jf 

^ f dJ P{J) E exp I ^ ^ Jij{(Ti(Tj + TiTj) 



/3' 



E E exp ^ E(^i^i + ^i^jf 

ai=±lTi=±l V^-'^ i<j 



where Y equals 



(A.20) 



E E expf^E^i^: 



CTi = ±l Ti=±l 



N 



i<3 



<JiTi 



(A.21) 



The calculation proceeds as in the case of the spin model on the complete 
graph, see section 2.3. We define for each configuration C — {ui, Tj} of the 2A'" 
spins, the overlap function 



1 



N 



i=l 



(A.22) 



The effective energy function appearing in the last term of the pseudo partition 
function Y (A.21) depends on the configuration through q{C) only. Following 
the steps of section 2.3, a saddle-point calculation leads to the asymptotic 
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behaviour of Y, 

Y = exp (^-N(3^(l)* + o{N)) 



(A.23) 



where 0* is the minimum over q of the "free-energy" functional f{q) defined 
in (25) with instead of T. The results of section 2.3 teach us that there is 
a "critical" temperature Tc — 1 such that (f)* — for temperatures above Tc 
and (j)* <0 when T < T^. 

Above Tc, the partition function does not fluctuate too much around the av- 
erage value Z{J); the partition function is itself self-averaging and the free- 
energy per spin simply equals /(T) = —Tin 2 — 1/(4T), see the paper by M. 
Talagrand in the same volume. At low temperatures, below Tc, the second mo- 
ment oi Z{J) is exponentially larger than the squared average; there are huge 
fluctuations and the partition function is not self-averaging. It is therefore 
much more complicated to calculate the value of the free-energy. 



A. 7 Exercise 7: A toy replica calculation. 

We want to compute the series expansion oiln{l+x) starting from the identity 
(for small real n) 

{1 + xy = l + n hi{l + x) + 0{n^) , (A.24) 



and the series expansion of (1 -|- x)" for integer n. To do so, we use Newton's 
binomial formula 



(l + ^r-E .„ ' > (A-25) 



valid for positive integers n. n play two roles in formula (A. 25). First, it is 
the upper bound of the sum over k. Secondly, n appears in the combinatorial 
factor in the sum. Factorials may be continued analytically to real values of n 
using Euler's Gamma function. As r(z) has poles at negative integer values of 
the argument z, we may extend the sum in expression (A. 25) to integer values 
of k larger than n without changing the final result. 
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Let us focus now on the combinatorial factor 



For A; = 0, we have C{n, 0) = 1 for all n. When A; > 1, the r.h.s. of (A.27) is a 
polynomial of n and can be immediately continued to real n. In the small n 
limit, we obtain 

C(n, k)=n ^ /.•••^ ^ + o(n) = n ^ + o(n) (k > 1) .(A.28) 

k\ k 

Finally, we write the small n continuation of equation (A. 25) as 

(1 + x)" = 1 + n 5] x'' + o{n) . (A.29) 

k=i ^ 



Comparing equation (A. 24) and (A.29), we obtain the correct result 

ln(l + x) = J2 ^ x'' . (A.30) 

k=i ^ 



The above calculation is a simple application of the replica trick. Obviously, 
the calculation of the free-energy of disordered models, e.g. the K-Satisfiability 
or the TSP models, are much more involved from a technical point of view. 
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